This article is part of the supplement: Second International Workshop on Data and Text Mining in Bioinformatics (DTMBio) 2008

Open Access Proceedings

Passage relevance models for genomics search

Jay Urbain1*, Ophir Frieder2 and Nazli Goharian2

Author Affiliations

1 Electrical Engineering and Computer Science Department, Milwaukee School of Engineering, Milwaukee, WI, USA

2 Information Retrieval Lab, Illinois Institute of Technology, Chicago, IL, USA

BMC Bioinformatics 2009, 10(Suppl 3):S3 doi:10.1186/1471-2105-10-S3-S3

Published: 19 March 2009

Abstract

We present a passage relevance model that integrates syntactic and semantic evidence of biomedical concepts and topics using a probabilistic graphical model. Component models of topics, concepts, terms, and documents are represented as potential functions within a Markov Random Field. The probability that a passage is relevant to a biologist's information need is represented as the joint distribution across all potential functions. Relevance model feedback from top-ranked passages is used to improve distributional estimates of query concepts and topics in context, and a dimensional indexing strategy is used for efficient aggregation of concept and term statistics. By integrating multiple sources of evidence, including dependencies between topics, concepts, and terms, we seek to improve the precision of genomics literature passage retrieval. Using this model, we demonstrate statistically significant improvements in retrieval precision on a large genomics literature corpus.
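The idea of scoring a passage by a joint distribution over potential functions can be illustrated with a minimal sketch. It assumes a log-linear combination, in which the unnormalized score is a weighted sum of log potentials. The function names, the weights, and the numeric potential values below are hypothetical and are not taken from the article; the abstract does not state how the authors parameterize or estimate their potentials.

```python
import math

# Hypothetical floor that keeps log() defined when a potential is zero.
EPSILON = 1e-12


def passage_log_score(potentials, weights):
    """Return the unnormalized log-score of one passage.

    potentials: dict mapping a component name (e.g. "topic", "concept",
                "term", "document") to a positive potential value.
    weights:    dict mapping component names to weights (assumed, not
                from the article); missing names default to 1.0.
    """
    return sum(
        weights.get(name, 1.0) * math.log(max(value, EPSILON))
        for name, value in potentials.items()
    )


def rank_passages(passages, weights):
    """Sort passages by descending log-score.

    Ranking needs no normalizing constant, because it is the same
    for every passage scored against the same query.
    """
    return sorted(
        passages,
        key=lambda p: passage_log_score(p["potentials"], weights),
        reverse=True,
    )


# Illustrative values only.
weights = {"topic": 1.0, "concept": 1.0, "term": 1.0, "document": 1.0}
passages = [
    {"id": "B", "potentials": {"topic": 0.1, "concept": 0.1, "term": 0.1, "document": 0.1}},
    {"id": "A", "potentials": {"topic": 0.5, "concept": 0.4, "term": 0.3, "document": 0.2}},
]
ranked = rank_passages(passages, weights)
```

Working in log space turns the product of potentials into a sum, which avoids numeric underflow when many small values are multiplied.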