Detecting overlapping protein complexes based on a generative model with functional and topological properties
1 School of Mathematics and Statistics, Central China Normal University, Luoyu Road, 430079 Wuhan, China
2 Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xingang Road West, 510275 Guangzhou, China
3 Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, China
BMC Bioinformatics 2014, 15:186. doi:10.1186/1471-2105-15-186. Published: 13 June 2014
Identifying protein complexes helps us better understand cellular mechanisms. With the increasing availability of large-scale protein-protein interaction (PPI) data, numerous computational approaches have been proposed to detect complexes from PPI networks. However, most current approaches do not consider overlaps among complexes or the functional annotation information of individual proteins. Therefore, they might not reflect the biological reality faithfully or make full use of the available domain-specific knowledge.
In this paper, we develop a Generative Model with Functional and Topological Properties (GMFTP) to describe the generative processes of the PPI network and the functional profile. The model provides a working mechanism for capturing the interaction structures and the functional patterns of proteins. By combining the functional and topological properties, we formulate the problem of identifying protein complexes as that of detecting groups of proteins that frequently interact with each other in the PPI network and have similar annotation patterns in the functional profile. Using the idea of link communities, our method naturally deals with overlaps among complexes. The benefits brought by the functional properties are demonstrated by real data analysis. The results, evaluated using four criteria with respect to two gold standards, show that GMFTP performs competitively against state-of-the-art approaches. The effectiveness of detecting overlapping complexes is also demonstrated by analyzing the topological and functional features of multi- and mono-group proteins.
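The formulation above (groups that are densely connected and similarly annotated, with overlaps allowed) can be illustrated with a minimal Python sketch. This is not the GMFTP model itself: the two scores used here (edge density within a group and mean pairwise Jaccard similarity of annotation sets), the function names, and the toy data are all assumptions chosen only to make the two properties and the notion of multi-group proteins concrete.

```python
from itertools import combinations

def interaction_density(group, edges):
    """Fraction of protein pairs in `group` that interact in the PPI network."""
    pairs = list(combinations(sorted(group), 2))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if frozenset((a, b)) in edges)
    return hits / len(pairs)

def annotation_similarity(group, annotations):
    """Mean pairwise Jaccard similarity of the proteins' annotation sets."""
    pairs = list(combinations(sorted(group), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        sa, sb = annotations.get(a, set()), annotations.get(b, set())
        union = sa | sb
        total += len(sa & sb) / len(union) if union else 0.0
    return total / len(pairs)

def multi_group_proteins(groups):
    """Proteins assigned to more than one group (overlapping membership)."""
    counts = {}
    for group in groups:
        for protein in group:
            counts[protein] = counts.get(protein, 0) + 1
    return {p for p, c in counts.items() if c > 1}

# Hypothetical toy data: five proteins, six interactions, three annotation terms.
edges = {frozenset(e) for e in [("A", "B"), ("A", "C"), ("B", "C"),
                                ("C", "D"), ("D", "E"), ("C", "E")]}
annotations = {"A": {"f1", "f2"}, "B": {"f1", "f2"}, "C": {"f1", "f2", "f3"},
               "D": {"f3"}, "E": {"f3"}}
# Two candidate complexes that share protein C.
groups = [{"A", "B", "C"}, {"C", "D", "E"}]

for group in groups:
    print(sorted(group),
          interaction_density(group, edges),
          round(annotation_similarity(group, annotations), 3))
print("multi-group proteins:", multi_group_proteins(groups))
```

In this toy network both candidate groups are fully connected, and protein C belongs to both, so it is the only multi-group protein. A generative model such as GMFTP infers the groups from the data instead of scoring groups supplied in advance.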
Based on the results obtained in this study, GMFTP appears to be a powerful approach for identifying overlapping protein complexes using both the PPI network and the functional profile. The software can be downloaded from http://email@example.com/dai/others/GMFTP.zip.