Search beyond traditional probabilistic information retrieval

dc.contributor.advisor: Huang, Jimmy
dc.creator: Hu, Qinmin
dc.date.accessioned: 2016-09-21T16:22:45Z
dc.date.available: 2016-09-21T16:22:45Z
dc.date.copyright: 2009-07
dc.degree.discipline: Computer Science & Engineering
dc.degree.level: Doctoral
dc.degree.name: PhD - Doctor of Philosophy
dc.description.abstract: This thesis focuses on search beyond probabilistic information retrieval. Three approaches are proposed that go beyond traditional probabilistic modelling. First, term association is examined in depth. Term association captures term dependency using a factor analysis based model, instead of treating each term independently. Latent factors, considered the same as the hidden variables of "eliteness" introduced by Robertson et al. to gain understanding of the relation among term occurrences and relevance, are measured by the dependencies and occurrences of term sequences and subsequences. Second, an entity-based ranking approach is proposed in an entity system named "EntityCube", which has been released by Microsoft for public use. A summarization page summarizes the entity information over multiple documents, so that truly relevant entities are more likely to be found across multiple documents by integrating the local relevance contributed by proximity with the global enhancer provided by a topic model. Third, multi-source fusion sets up a meta-search engine to combine the "knowledge" from different sources. Meta-features, distilled as high-level categories, are deployed to diversify the baselines. Three modified fusion methods are employed: reciprocal, CombMNZ and CombSUM, with three expanded versions. Through extensive experiments on the standard large-scale TREC Genomics data sets, the TREC HARD data sets and the Microsoft EntityCube Web collections, the proposed extended models beyond probabilistic information retrieval show their effectiveness and superiority.
dc.identifier.uri: http://hdl.handle.net/10315/32366
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject.keywords: Search
dc.subject.keywords: Probabilistic information retrieval
dc.subject.keywords: Term association
dc.subject.keywords: Entity-based ranking
dc.subject.keywords: Multi-source fusion
dc.title: Search beyond traditional probabilistic information retrieval
dc.type: Electronic Thesis or Dissertation

Files

Original bundle
Name: Hu_Qinmin_2013_PhD.pdf
Size: 6.37 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.83 KB
Format: Plain Text
Name: YorkU_ETDlicense.txt
Size: 3.36 KB
Format: Plain Text