Open Access Methodology article

Improving protein function prediction using domain and protein complexes in PPI networks

Wei Peng1,2, Jianxin Wang1*, Juan Cai1, Lu Chen1, Min Li1 and Fang-Xiang Wu1,3

Author Affiliations

1 School of Information Science and Engineering, Central South University, Changsha, Hunan 410083, PR China

2 Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650093, PR China

3 Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada

BMC Systems Biology 2014, 8:35  doi:10.1186/1752-0509-8-35

Published: 24 March 2014

Abstract

Background

Characterizing unknown proteins through computational approaches is one of the most challenging problems in in silico biology, and it has attracted worldwide interest and considerable effort. Several computational methods have been proposed to address this problem; they are based either on homology mapping or on the context of protein interaction networks.

Results

In this paper, two algorithms are proposed that integrate the protein-protein interaction (PPI) network, proteins’ domain information and protein complexes. The first is domain combination similarity (DCS), which combines the domain compositions of both proteins and their neighbors. The second is domain combination similarity in context of protein complexes (DSCP), which extends the protein functional similarity definition of DCS by combining the domain compositions of both proteins and the complexes that include them. The new algorithms are tested on networks of the model species Saccharomyces cerevisiae to predict the functions of unknown proteins using cross-validation. Compared with several other existing algorithms, the results demonstrate the effectiveness of the proposed methods in protein function prediction. Furthermore, the DSCP algorithm, when using experimentally determined complex data, is robust when a large percentage of the proteins in the network are unknown, and it outperforms DCS and several other existing algorithms.
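
As a rough illustration of the idea of comparing proteins by the domain compositions of themselves and their network context, the sketch below pools a protein's domains with those of its neighbors and scores two proteins by Jaccard overlap. This is not the DCS or DSCP definition from the paper, which the abstract does not specify; the function names, the dictionary-based inputs and the choice of Jaccard similarity are all assumptions made for illustration only.

```python
from typing import Dict, Set


def domain_context(protein: str,
                   domains: Dict[str, Set[str]],
                   context: Dict[str, Set[str]]) -> Set[str]:
    """Pool the domains of a protein with those of its context proteins.

    `context` maps a protein to related proteins (here: PPI neighbors).
    Hypothetical helper, not taken from the paper.
    """
    pooled = set(domains.get(protein, set()))
    for other in context.get(protein, set()):
        pooled |= domains.get(other, set())
    return pooled


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def context_similarity(p: str, q: str,
                       domains: Dict[str, Set[str]],
                       context: Dict[str, Set[str]]) -> float:
    """Assumed similarity: Jaccard overlap of the pooled domain sets."""
    return jaccard(domain_context(p, domains, context),
                   domain_context(q, domains, context))
```

Passing a map from each protein to its fellow complex members instead of its PPI neighbors would give a loose analogue of the complex-based variant, again only as an assumption about how such context could be supplied.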

Conclusions

The accuracy of predicting protein function can be improved by integrating the protein-protein interaction (PPI) network, proteins’ domain information and protein complexes.