Comparative Analysis of Language Models on Augmented Low-Resource Datasets for Application in Question & Answering Systems
dc.contributor.advisor | Erechtchoukova, Marina G. | |
dc.contributor.author | Ranjbargol, Seyedehsamaneh | |
dc.date.accessioned | 2024-11-07T11:01:54Z | |
dc.date.available | 2024-11-07T11:01:54Z | |
dc.date.copyright | 2024-05-24 | |
dc.date.issued | 2024-11-07 | |
dc.date.updated | 2024-11-07T11:01:53Z | |
dc.degree.discipline | Information Systems and Technology | |
dc.degree.level | Master's | |
dc.degree.name | MA - Master of Arts | |
dc.description.abstract | This thesis aims to advance natural language processing (NLP) in question-answering (QA) systems for low-resource domains. The research presents a comparative analysis of several pre-trained language models and the performance gains they achieve when fine-tuned with augmented data. It addresses several critical questions, such as the effectiveness of synthetic data and the efficiency of data augmentation techniques for improving QA systems in specialized contexts. The study focuses on developing a hybrid QA framework that can be integrated with a cloud-based information system. This approach refines the functionality and applicability of QA systems, boosting their performance in low-resource settings by using targeted fine-tuning and advanced transformer models. The successful application of this method demonstrates the significant potential for specialized, AI-driven QA systems to adapt and thrive in specific environments. | |
dc.identifier.uri | https://hdl.handle.net/10315/42401 | |
dc.language | en | |
dc.rights | Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests. | |
dc.subject | Information technology | |
dc.subject | Computer science | |
dc.subject.keywords | Natural language processing | |
dc.subject.keywords | NLP | |
dc.subject.keywords | Question answering systems | |
dc.subject.keywords | QA systems | |
dc.subject.keywords | Low-resource domains | |
dc.subject.keywords | Pre-trained language models | |
dc.subject.keywords | Fine-tuning | |
dc.subject.keywords | Data augmentation | |
dc.subject.keywords | Transformer models | |
dc.subject.keywords | BERT | |
dc.subject.keywords | Sentence transformers | |
dc.subject.keywords | Cosine similarity | |
dc.subject.keywords | Machine reading comprehension | |
dc.subject.keywords | Synthetic data | |
dc.subject.keywords | Active learning | |
dc.subject.keywords | Contextual synonym substitution | |
dc.subject.keywords | Hugging Face | |
dc.subject.keywords | Cloud-based applications | |
dc.subject.keywords | Domain-specific QA | |
dc.subject.keywords | Information retrieval | |
dc.subject.keywords | Deep learning | |
dc.subject.keywords | Artificial intelligence | |
dc.subject.keywords | AI | |
dc.subject.keywords | Neural networks | |
dc.subject.keywords | Semantic analysis | |
dc.subject.keywords | Language models | |
dc.subject.keywords | Machine learning | |
dc.subject.keywords | AI-driven QA systems | |
dc.title | Comparative Analysis of Language Models on Augmented Low-Resource Datasets for Application in Question & Answering Systems | |
dc.type | Electronic Thesis or Dissertation |
Files
Original bundle
- Name: Ranjbargol_Seyedehsamaneh_2024_Masters.pdf
- Size: 2.66 MB
- Format: Adobe Portable Document Format