Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Methodology article

FACT: Functional annotation transfer between proteins with similar feature architectures

Tina Koestler1234, Arndt von Haeseler1234 and Ingo Ebersberger1234*

Author Affiliations

1 Center for Integrative Bioinformatics Vienna (CIBIV), Max F. Perutz Laboratories, Dr. Bohrgasse 9, A-1030 Vienna, Austria

2 University of Vienna, Vienna, Austria

3 Medical University of Vienna, Vienna, Austria

4 University of Veterinary Medicine, Vienna, Austria

For all author emails, please log on.

BMC Bioinformatics 2010, 11:417  doi:10.1186/1471-2105-11-417

Published: 9 August 2010

Abstract

Background

The increasing number of sequenced genomes provides the basis for exploring the genetic and functional diversity within the tree of life. Only a tiny fraction of the encoded proteins undergoes a thorough experimental characterization. For the remainder, bioinformatics annotation tools are the only means to infer their function. Exploiting significant sequence similarities to already characterized proteins, commonly taken as evidence for homology, is the prevalent method to deduce functional equivalence. Such methods fail when homologs are too diverged, or when they have assumed a different function. Finally, due to convergent evolution, functional equivalence is not necessarily linked to common ancestry. Therefore complementary approaches are required to identify functional equivalents.

Results

We present the Feature Architecture Comparison Tool http://www.cibiv.at/FACT webcite to search for functionally equivalent proteins. FACT uses the similarity between feature architectures of two proteins, i.e., the arrangements of functional domains, secondary structure elements and compositional properties, as a proxy for their functional equivalence. A scoring function measures feature architecture similarities, which enables searching for functional equivalents in entire proteomes. Our evaluation of 9,570 EC classified enzymes revealed that FACT, using the full feature, set outperformed the existing architecture-based approaches by identifying significantly more functional equivalents as highest scoring proteins. We show that FACT can identify functional equivalents that share no significant sequence similarity. However, when the highest scoring protein of FACT is also the protein with the highest local sequence similarity, it is in 99% of the cases functionally equivalent to the query. We demonstrate the versatility of FACT by identifying a missing link in the yeast glutathione metabolism and also by searching for the human GolgA5 equivalent in Trypanosoma brucei.

Conclusions

FACT facilitates a quick and sensitive search for functionally equivalent proteins in entire proteomes. FACT is complementary to approaches using sequence similarity to identify proteins with the same function. Thus, FACT is particularly useful when functional equivalents need to be identified in evolutionarily distant species, or when functional equivalents are not homologous. The most reliable annotation transfers, however, are achieved when feature architecture similarity and sequence similarity are jointly taken into account.