This article is part of the supplement: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2010
Gene Cluster Profile Vectors: a method to infer functionally related gene sets by grouping proximity-based gene clusters
School of Informatics and Computing, Indiana University, 150 S Woodlawn Ave, Bloomington, IN 47405, USA
BMC Genomics 2011, 12(Suppl 2):S2 doi:10.1186/1471-2164-12-S2-S2Published: 27 July 2011
Proximity-based methods and co-evolution-based phylogenetic profiles methods have been successfully used for the identification of functionally related genes. Proximity-based methods are effective for physically clustered genes while the phylogenetic profiles method is effective for co-occurring gene sets. However, both methods predict many false positives and false negatives. In this paper, we propose the Gene Cluster Profile Vector (GCPV) method, which combines these two methods by using phylogenetic profiles of whole gene clusters. The GCPV method is, currently, the only genome comparison based method that allows for the characterization of relationships between gene clusters based profiles of individual genes in clusters.
The GCPV method groups together reasonably related operons in E. coli about 60% of the time. The method is not sensitive to the choice of a reference genome set used and it outperforms the conventional phylogenetic profiles method. Finally, we show that the method works well for predicted gene clusters from C. crescentus and can serve as an important tool not only for understanding gene function, but also for elucidating mechanisms of general biological processes.
The GCPV method has shown to be an effective and robust approach to the prediction of functionally related gene sets from proximity-based gene clusters or operons.