This article is part of the supplement: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2010

Open Access Proceedings

Gene Cluster Profile Vectors: a method to infer functionally related gene sets by grouping proximity-based gene clusters

Vikas Rao Pejaver and Sun Kim*

Author affiliations

School of Informatics and Computing, Indiana University, 150 S Woodlawn Ave, Bloomington, IN 47405, USA

Citation and License

BMC Genomics 2011, 12(Suppl 2):S2  doi:10.1186/1471-2164-12-S2-S2

Published: 27 July 2011

Abstract

Background

Proximity-based methods and co-evolution-based phylogenetic profile methods have been used successfully to identify functionally related genes. Proximity-based methods are effective for physically clustered genes, while the phylogenetic profiles method is effective for co-occurring gene sets. However, both methods predict many false positives and false negatives. In this paper, we propose the Gene Cluster Profile Vector (GCPV) method, which combines these two methods by using phylogenetic profiles of whole gene clusters. The GCPV method is currently the only genome-comparison-based method that allows relationships between gene clusters to be characterized based on the profiles of the individual genes in those clusters.
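
To make the idea of a cluster-level phylogenetic profile concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that a gene's phylogenetic profile is a binary presence/absence vector across a set of reference genomes, and that a cluster's profile can be summarized as the fraction of its genes present in each genome. The aggregation rule, the cosine similarity measure, the function names, and the toy data are all assumptions made for illustration; the abstract does not specify how GCPVs are built or compared.

```python
from math import sqrt

# Hypothetical toy data: each gene maps to a presence (1) / absence (0)
# profile across four reference genomes. Not taken from the paper.
GENE_PROFILES = {
    "geneA": [1, 1, 0, 1],
    "geneB": [1, 1, 0, 0],
    "geneC": [1, 0, 0, 1],
    "geneD": [0, 0, 1, 0],
}


def cluster_profile(genes, profiles):
    """Summarize a gene cluster as one vector over the reference genomes.

    Assumed aggregation: for each genome, the fraction of the cluster's
    genes that are present in that genome.
    """
    n = len(genes)
    columns = zip(*(profiles[g] for g in genes))
    return [sum(col) / n for col in columns]


def cosine_similarity(u, v):
    """Cosine similarity between two profile vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Two hypothetical clusters that share genomes, and one that does not.
cluster_1 = cluster_profile(["geneA", "geneB"], GENE_PROFILES)
cluster_2 = cluster_profile(["geneC"], GENE_PROFILES)
cluster_3 = cluster_profile(["geneD"], GENE_PROFILES)

print(cluster_1)
print(cosine_similarity(cluster_1, cluster_2))
print(cosine_similarity(cluster_1, cluster_3))
```

Under these assumptions, clusters whose genes tend to occur in the same reference genomes receive a high similarity score and would be candidates for grouping, while clusters with disjoint occurrence patterns score zero.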

Results

The GCPV method groups together reasonably related operons in E. coli about 60% of the time. The method is not sensitive to the choice of reference genome set, and it outperforms the conventional phylogenetic profiles method. Finally, we show that the method works well for predicted gene clusters from C. crescentus and can serve as an important tool not only for understanding gene function, but also for elucidating mechanisms of general biological processes.

Conclusions

The GCPV method has been shown to be an effective and robust approach to predicting functionally related gene sets from proximity-based gene clusters or operons.