
This article is part of the supplement: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2010

Open Access Proceedings

Promoting ranking diversity for genomics search with relevance-novelty combined model

Xiaoshi Yin1,2, Zhoujun Li1,2,3*, Jimmy Xiangji Huang4 and Xiaohua Hu5

Author Affiliations

1 State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China

2 School of Computer Science and Engineering, Beihang University, Beijing, China

3 Beijing Key Laboratory of Network Technology, Beihang University, Beijing, China

4 School of Information Technology, York University, Toronto, Canada

5 College of Information Science and Technology, Drexel University, Philadelphia, PA, USA


BMC Bioinformatics 2011, 12(Suppl 5):S8  doi:10.1186/1471-2105-12-S5-S8

Published: 27 July 2011

Abstract

Background

In the biomedical domain, the information a biologist seeks with a question (query) is usually a list of entities of a certain type, such as genes, proteins, diseases or mutations, covering different aspects related to the question. A biomedical information retrieval system therefore needs to provide comprehensive and diverse answers to fulfill biologists’ information needs. Traditional retrieval models, however, assume that the relevance of a document is independent of the relevance of other documents. This assumption may result in high redundancy and low diversity in the ranked lists they produce.

Results

In this paper, we propose a relevance-novelty combined model, named the RelNov model, based on the framework of an undirected graphical model. It consists of two component models, the aspect-term relevance model and the aspect-term novelty model, which model the relevance and the novelty of a document, respectively. We show that our approach achieves a 16.4% improvement over the highest aspect-level mean average precision (MAP) and a 9.8% improvement over the highest passage-level MAP reported in the TREC 2007 Genomics track.
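
As a rough illustration of why combining relevance with novelty changes a ranking, the following minimal sketch re-ranks documents greedily by a weighted sum of a relevance score and an aspect-level novelty score. It is not the RelNov model itself, which is built on an undirected graphical model with potential functions; the function name `rerank_relevance_novelty`, the `weight` parameter, the linear interpolation, and the toy documents and aspect labels are all assumptions made for this example.

```python
from typing import Dict, List, Set


def rerank_relevance_novelty(
    relevance: Dict[str, float],
    aspects: Dict[str, Set[str]],
    weight: float = 0.5,
) -> List[str]:
    """Greedily rank documents by weight * relevance + (1 - weight) * novelty.

    Novelty (an assumption of this sketch) is the fraction of a document's
    aspects that no higher-ranked document has covered yet.
    """
    remaining = set(relevance)
    covered: Set[str] = set()
    ranked: List[str] = []

    while remaining:
        def score(doc: str) -> float:
            doc_aspects = aspects.get(doc, set())
            if doc_aspects:
                novelty = len(doc_aspects - covered) / len(doc_aspects)
            else:
                novelty = 0.0
            return weight * relevance[doc] + (1.0 - weight) * novelty

        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=score)
        ranked.append(best)
        covered |= aspects.get(best, set())
        remaining.remove(best)

    return ranked


# Hypothetical toy data: d1 and d2 cover the same aspect, d3 a different one.
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.6}
aspects = {"d1": {"BRCA1"}, "d2": {"BRCA1"}, "d3": {"TP53"}}

print(rerank_relevance_novelty(relevance, aspects, weight=1.0))  # relevance only
print(rerank_relevance_novelty(relevance, aspects, weight=0.5))  # relevance + novelty
```

With `weight=1.0` the ranking depends on relevance alone, so the two redundant documents occupy the top two positions. With `weight=0.5`, the document covering the unseen aspect moves ahead of the redundant one, which is the kind of diversity gain the abstract describes.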

Conclusions

The proposed combined model, which models aspects, terms, topic relevance and document novelty as potential functions, is shown to be effective both in promoting ranking diversity and in improving the relevance of ranked lists for genomics search. We also show that the use of aspects plays an important role in the model. Moreover, the proposed model can easily integrate different relevance and novelty measures.