Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Software

PCOGR: Phylogenetic COG ranking as an online tool to judge the specificity of COGs with respect to freely definable groups of organisms

Florian Meereis and Michael Kaufmann*

Author Affiliations

The Protein Chemistry Group, Institute of Neurobiochemistry, Witten/Herdecke University, Stockumer Str. 10, 58448 Witten, Germany

For all author emails, please log on.

BMC Bioinformatics 2004, 5:150  doi:10.1186/1471-2105-5-150

Published: 15 October 2004

Abstract

Background

The rapidly increasing number of completely sequenced genomes led to the establishment of the COG-database which, based on sequence homologies, assigns similar proteins from different organisms to clusters of orthologous groups (COGs). There are several bioinformatic studies that made use of this database to determine (hyper)thermophile-specific proteins by searching for COGs containing (almost) exclusively proteins from (hyper)thermophilic genomes. However, public software to perform individually definable group-specific searches is not available.

Results

The tool described here exactly fills this gap. The software is accessible at http://www.uni-wh.de/pcogr webcite and is linked to the COG-database. The user can freely define two groups of organisms by selecting for each of the (current) 66 organisms to belong either to groupA, to the reference groupB or to be ignored by the algorithm. Then, for all COGs a specificity index is calculated with respect to the specificity to groupA, i. e. high scoring COGs contain proteins from the most of groupA organisms while proteins from the most organisms assigned to groupB are absent. In addition to ranking all COGs according to the user defined specificity criteria, a graphical visualization shows the distribution of all COGs by displaying their abundance as a function of their specificity indexes.

Conclusions

This software allows detecting COGs specific to a predefined group of organisms. All COGs are ranked in the order of their specificity and a graphical visualization allows recognizing (i) the presence and abundance of such COGs and (ii) the phylogenetic relationship between groupA- and groupB-organisms. The software also allows detecting putative protein-protein interactions, novel enzymes involved in only partially known biochemical pathways, and alternate enzymes originated by convergent evolution.