Comparative Analysis of Language Models on Augmented Low-Resource Datasets for Application in Question & Answering Systems

dc.contributor.advisor: Erechtchoukova, Marina G.
dc.contributor.author: Ranjbargol, Seyedehsamaneh
dc.date.accessioned: 2024-11-07T11:01:54Z
dc.date.available: 2024-11-07T11:01:54Z
dc.date.copyright: 2024-05-24
dc.date.issued: 2024-11-07
dc.date.updated: 2024-11-07T11:01:53Z
dc.degree.discipline: Information Systems and Technology
dc.degree.level: Master's
dc.degree.name: MA - Master of Arts
dc.description.abstract: This thesis aims to advance natural language processing (NLP) in question-answering (QA) systems for low-resource domains. The research presents a comparative analysis of several pre-trained language models, highlighting their performance enhancements when fine-tuned with augmented data to address several critical questions, such as the effectiveness of synthetic data and the efficiency of data augmentation techniques for improving QA systems in specialized contexts. The study focuses on developing a hybrid QA framework that can be integrated with a cloud-based information system. This approach refines the functionality and applicability of QA systems, boosting their performance in low-resource settings by using targeted fine-tuning and advanced transformer models. The successful application of this method demonstrates the significant potential for specialized, AI-driven QA systems to adapt and thrive in specific environments.
dc.identifier.uri: https://hdl.handle.net/10315/42401
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject: Information technology
dc.subject: Computer science
dc.subject.keywords: Natural language processing
dc.subject.keywords: NLP
dc.subject.keywords: Question answering systems
dc.subject.keywords: QA systems
dc.subject.keywords: Low-resource domains
dc.subject.keywords: Pre-trained language models
dc.subject.keywords: Fine-tuning
dc.subject.keywords: Data augmentation
dc.subject.keywords: Transformer models
dc.subject.keywords: BERT
dc.subject.keywords: Sentence transformers
dc.subject.keywords: Cosine similarity
dc.subject.keywords: Machine reading comprehension
dc.subject.keywords: Synthetic data
dc.subject.keywords: Active learning
dc.subject.keywords: Contextual synonym substitution
dc.subject.keywords: Hugging Face
dc.subject.keywords: Cloud-based applications
dc.subject.keywords: Domain-specific QA
dc.subject.keywords: Information retrieval
dc.subject.keywords: Deep learning
dc.subject.keywords: Artificial intelligence
dc.subject.keywords: AI
dc.subject.keywords: Neural networks
dc.subject.keywords: Semantic analysis
dc.subject.keywords: Language models
dc.subject.keywords: Machine learning
dc.subject.keywords: AI-driven QA systems
dc.title: Comparative Analysis of Language Models on Augmented Low-Resource Datasets for Application in Question & Answering Systems
dc.type: Electronic Thesis or Dissertation

Files

Original bundle
Name: Ranjbargol_Seyedehsamaneh_2024_Masters.pdf
Size: 2.66 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.87 KB
Format: Plain Text
Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text