Statistical Modeling to Information Retrieval for Searching from Big Text Data and Higher Order Inference for Reliability

Zhou, Xiaofeng

Files

Zhou_Xiaofeng_2014_PhD.pdf (940.82 KB)

Date

2015-01-26

Authors

Zhou, Xiaofeng

Abstract

This thesis covers two research projects: probabilistic information retrieval modeling and third-order inference on reliability. In the first part of the dissertation, two research topics in information retrieval are studied and evaluated on large-scale text data sets. First, we conduct an in-depth study of the relationship between document length and a document's relevance to the user's need. Two statistical methods are proposed that incorporate document length as a substantial weighting factor to achieve higher retrieval performance. Second, we use the properties of the survival function to propose a cost-based re-ranking method that promotes ranking diversity in biomedical information retrieval, and to model the proximity between query terms to improve retrieval performance. In extensive experiments on standard TREC collections, the proposed models perform significantly better than classical probabilistic information retrieval models. In the second part of the dissertation, a small-sample asymptotic method is proposed for higher-order inference in the stress-strength reliability model, R = P(Y < X), where X and Y are independently distributed. A penalized likelihood method is proposed to handle the numerical complications of maximizing the constrained likelihood. Simulation studies are conducted on two distributions: the Burr type X distribution and the exponentiated exponential distribution. The simulation results show that the proposed method is very accurate even when the sample sizes are small.

Keywords

Information technology, Statistics, Mathematics

URI

http://hdl.handle.net/10315/28237

Collections

Mathematics & Statistics
