Enhancing General Language Models for Biomedical Text Retrieval via Diversified Prior Knowledge

dc.contributor.advisor: Huang, Jimmy
dc.contributor.author: Huang, Yizheng
dc.date.accessioned: 2023-12-08T14:42:10Z
dc.date.available: 2023-12-08T14:42:10Z
dc.date.issued: 2023-12-08
dc.date.updated: 2023-12-08T14:42:09Z
dc.degree.discipline: Information Systems and Technology
dc.degree.level: Master's
dc.degree.name: MA - Master of Arts
dc.description.abstract: The thesis introduces the Diversified Prior Knowledge Enhanced General Language Model (DPK-GLM) to improve the efficacy of general language models in biomedical Information Retrieval (IR). General language models often struggle with biomedical data due to its specialized terminology and the need for precise matching. DPK-GLM tackles these challenges by integrating domain-specific knowledge, thereby enhancing the model's ability to understand and process biomedical information. The framework comprises three core components. The first, Knowledge-based Query Expansion, leverages authoritative biomedical databases to enrich search queries with domain-specific entities. The second, Aspect-based Filter, identifies documents that are highly relevant to the query. The third, Diversity-based Score Reweighting, re-ranks these filtered documents by combining similarity and diversity scores, yielding more accurate results. Experimental tests on public biomedical IR datasets confirm that DPK-GLM significantly improves retrieval performance.
dc.identifier.uri: https://hdl.handle.net/10315/41736
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject: Information technology
dc.subject: Artificial intelligence
dc.subject: Bioinformatics
dc.subject.keywords: Biomedical information retrieval
dc.subject.keywords: Text retrieval
dc.subject.keywords: Ranking
dc.subject.keywords: Deep learning
dc.title: Enhancing General Language Models for Biomedical Text Retrieval via Diversified Prior Knowledge
dc.type: Electronic Thesis or Dissertation

Files

Original bundle
Name: Huang_Yizheng_2023_Masters.pdf
Size: 3.1 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.87 KB
Format: Plain Text
Name: YorkU_ETDlicense.txt
Size: 3.39 KB
Format: Plain Text