Identifying metabolic enzymes with multiple types of association evidence
* Corresponding authors: Dennis Vitkup vitkup@dbmi.columbia.edu - George M Church g1m1c1@receptor.med.harvard.edu
1 Department of Genetics, New Research Building (NRB) Room 238, 77 Ave. Louis Pasteur, Harvard Medical School, Boston, MA 02115, USA
2 Center for Computational Biology and Bioinformatics, Department of Biomedical Informatics, Columbia University, 1150 St. Nicholas Ave., New York, NY 10032, USA
3 Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive 0404, Room 4126, La Jolla, CA 92093, USA
BMC Bioinformatics 2006, 7:177 doi:10.1186/1471-2105-7-177
Published: 29 March 2006
Abstract
Background
Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions that cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes.
Results
We present a novel method for identifying genes that encode a specific metabolic function, based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events, and others. Using the E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained by combining all of the data. In this way, our method ranks 60% of the enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and ranks the correct gene as the top candidate in 43% of cases.
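The ranking idea described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions and not the method used in this work: the gene names, scores, and evidence weights are hypothetical, and a plain weighted average stands in for whatever procedure actually combines the evidence. Each candidate gene is scored by its average association with the known genes that neighbor the missing enzyme in the metabolic network, and candidates are then sorted by the combined score.

```python
# Hypothetical sketch: rank candidate genes for a missing enzymatic function
# by combining several types of association evidence with neighboring genes.
# The weighted average is an assumption made for illustration only.

def combined_score(candidate, neighbors, evidence, weights):
    """Weighted sum, over evidence types, of the candidate's mean
    association score with the network-neighbor genes."""
    total = 0.0
    for kind, weight in weights.items():
        # Missing (candidate, neighbor) pairs count as no association (0.0).
        scores = [evidence[kind].get((candidate, n), 0.0) for n in neighbors]
        total += weight * (sum(scores) / len(scores))
    return total


def rank_candidates(candidates, neighbors, evidence, weights):
    """Return candidates sorted from strongest to weakest combined evidence."""
    return sorted(
        candidates,
        key=lambda g: combined_score(g, neighbors, evidence, weights),
        reverse=True,
    )


# Hypothetical data: genes n1 and n2 catalyze reactions adjacent to the gap.
neighbors = ["n1", "n2"]
candidates = ["geneA", "geneB", "geneC"]
evidence = {
    "chromosome_clustering": {("geneA", "n1"): 0.9, ("geneB", "n1"): 0.2},
    "phylogenetic_profile": {("geneA", "n2"): 0.5, ("geneC", "n2"): 0.8},
    "coexpression": {("geneB", "n2"): 0.5},
}
# Equal weights are an arbitrary choice for this sketch.
weights = {kind: 1.0 for kind in evidence}

ranking = rank_candidates(candidates, neighbors, evidence, weights)
print(ranking)
```

In this toy example no single evidence type supports all candidates, so the ordering depends on the combination, which mirrors the observation that combined evidence predicts better than any individual type.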
Conclusion
We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities.