This article is part of the supplement: Advanced intelligent computing theories and their applications in bioinformatics. Proceedings of the 2011 International Conference on Intelligent Computing (ICIC 2011)
Inferring a protein interaction map of Mycobacterium tuberculosis based on sequences and interologs
1 Key Laboratory of Systems Biology, SIBS-Novo Nordisk Translational Research Centre for PreDiabetes, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
2 National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing 100190, China
3 Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
4 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
5 Hong Kong Bioinformatics Centre, School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin N. T., Hong Kong, China
6 Collaborative Research Center for Innovative Mathematical Modelling, Institute of Industrial Science, University of Tokyo, Tokyo 153-8505, Japan
BMC Bioinformatics 2012, 13(Suppl 7):S6 doi:10.1186/1471-2105-13-S7-S6. Published: 8 May 2012
Mycobacterium tuberculosis is an infectious bacterium that poses a serious threat to human health. Because protein interactions in this organism are difficult to detect with molecular biology experiments, a computationally reconstructed protein interaction map of M. tuberculosis can provide crucial information for understanding the biological processes of this pathogen, as well as a framework upon which new therapeutic approaches can be developed.
In this paper, we construct an integrated M. tuberculosis protein interaction network using machine learning and ortholog-based methods. First, we built a support vector machine (SVM) predictor that infers the protein interactions of M. tuberculosis H37Rv from gene sequence information. We tested the predictor on Escherichia coli and mapped the codon features underlying its protein interactions to M. tuberculosis. Second, the documented interactions of 14 other species were mapped to the M. tuberculosis interactome by the interolog method. The combined set of predicted interactions was validated against several functional relationships extracted from heterogeneous data sources, namely gene coexpression, evolutionary relationship and functional similarity. The prediction accuracy and this validation demonstrate the effectiveness and efficiency of our framework.
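The interolog step described above can be illustrated with a minimal sketch: if two proteins interact in a source species and both have orthologs in the target species, the orthologs are predicted to interact. The function name `map_interologs`, the ortholog table layout and all identifiers below are hypothetical and chosen only for illustration; this is not the authors' implementation, and it omits any ortholog-confidence filtering a real pipeline would likely apply.

```python
from itertools import product


def map_interologs(source_interactions, ortholog_map):
    """Transfer interactions from a source species to a target species.

    source_interactions: iterable of (protein_a, protein_b) pairs documented
        in the source species.
    ortholog_map: dict mapping a source protein ID to a set of target
        ortholog IDs (hypothetical layout, assumed for this sketch).
    Returns a set of unordered target pairs (frozensets).
    """
    predicted = set()
    for a, b in source_interactions:
        targets_a = ortholog_map.get(a, ())
        targets_b = ortholog_map.get(b, ())
        # Every ortholog of a is paired with every ortholog of b.
        for ta, tb in product(targets_a, targets_b):
            if ta != tb:  # self-pairs are skipped in this sketch
                predicted.add(frozenset((ta, tb)))
    return predicted


# Toy data with made-up identifiers (not real proteins).
source_ppis = [("srcA", "srcB"), ("srcB", "srcC"), ("srcC", "srcD")]
orthologs = {"srcA": {"tgt1"}, "srcB": {"tgt2", "tgt3"}, "srcC": {"tgt4"}}

predicted = map_interologs(source_ppis, orthologs)
for pair in sorted(tuple(sorted(p)) for p in predicted):
    print(pair)
```

Pairs are stored as frozensets so that an interaction transferred from several source species is counted once, which matters when interactions from many species are pooled into one network.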
A protein interaction map of M. tuberculosis was inferred from codon features and interologs. The prediction accuracy and the validation by numerical experiments demonstrate the effectiveness and efficiency of our method. Furthermore, the method can be readily extended to infer the protein interactions of other bacterial species.
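As a rough illustration of how codon information from a gene pair could be turned into an input for a classifier such as an SVM, the sketch below computes relative codon frequencies for two coding sequences and concatenates them. The exact feature encoding used in the paper is not specified in this section, so the 64-codon frequency vector, the concatenation scheme and the names `codon_frequencies` and `pair_features` are assumptions made for this example only.

```python
from itertools import product

# The 64 codons in a fixed order, so every feature vector is aligned.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Return the relative frequency of each codon in a coding sequence."""
    cds = cds.upper()
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    # Read in frame from position 0; a trailing partial codon is ignored.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # codons with ambiguous bases are skipped
            counts[codon] += 1
            total += 1
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


def pair_features(cds_a, cds_b):
    """Concatenate the codon frequencies of two genes (assumed encoding)."""
    return codon_frequencies(cds_a) + codon_frequencies(cds_b)


# Toy sequences, not real genes.
features = pair_features("ATGGCTGCTTAA", "ATGAAATAA")
print(len(features))
```

Because the features depend only on coding sequences, the same function applies unchanged to any bacterial genome. Note that simple concatenation makes the vector depend on the order of the two genes; a symmetric encoding would be needed if that is undesirable.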