Protein complex prediction via verifying and reconstructing the topology of domain-domain interactions
1 Institute for Advanced Biosciences, Keio University, 403-1, Daihoji, Tsuruoka, Yamagata 997-0017, Japan
2 Systems Biology Program, Graduate School of Media and Governance, Keio University, Endo 5322, Fujisawa, Kanagawa 252-8520, Japan
3 Faculty of Environment and Information Studies, Keio University, Endo 5322, Fujisawa, Kanagawa 252-8520, Japan
4 Graduate School of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku, Yokohama, Kanagawa 223-8522, Japan
5 IBM Research - Tokyo, 1623-14 Shimo-tsuruma, Yamato, Kanagawa, 242-8502 Japan
6 Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-8656 Tokyo, Japan
BMC Bioinformatics 2010, 11:350 doi:10.1186/1471-2105-11-350 Published: 28 June 2010
High-throughput methods for detecting protein-protein interactions yield large interaction networks, from which associations of proteins can be computationally identified as protein complexes. Existing methods extract protein complexes from interaction networks as sets of proteins, but the extracted complexes may include false positives: these methods do not account for the structural limitations of the proteins and therefore do not check whether the proteins in an extracted complex can bind to each other simultaneously. In addition, few studies have sought deeper insight into protein complexes, such as the topology of the protein-protein interactions or the domain-domain interactions that mediate them.
Here, we introduce a combinatorial approach to protein complex prediction that focuses not only on determining the member proteins of complexes but also on how domain-domain interactions (DDIs) and protein-protein interactions (PPIs) are organized within the complexes. Our method analyzes complex candidates predicted by existing methods. It searches for optimal combinations of domain-domain interactions in the candidates, based on the assumption that the proteins in a candidate can form a true protein complex if each of the domains is used by a single protein interaction. This optimization problem was formulated mathematically and solved using binary integer linear programming. Using publicly available sets of yeast protein-protein interactions and domain-domain interactions, we extracted protein complex candidates with an accuracy twice the average accuracy of the existing methods MCL, MCODE, and clustering coefficient. Although tuning the parameters of each algorithm slightly improved its precision, our method showed better precision for most parameter values.
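The single-use assumption on domains can be illustrated with a toy 0/1 selection problem. The sketch below is not the formulation used in this work, whose objective and constraints are not given here. It assumes one binary variable per candidate DDI, the constraint that each domain instance takes part in at most one selected DDI, at most one selected DDI per protein pair, and an objective that maximizes the number of selected DDIs. Exhaustive enumeration stands in for an integer programming solver, and the function name `select_ddis` and the example proteins and domains are hypothetical.

```python
from itertools import product


def select_ddis(candidates):
    """Pick a largest set of candidate DDIs in which every domain is used once.

    candidates: list of ((protein_a, domain_a), (protein_b, domain_b)) tuples.
    Brute force over all 0/1 assignments; only suitable for tiny examples.
    """
    best_size, best_choice = -1, ()
    for choice in product((0, 1), repeat=len(candidates)):
        used_domains = set()  # domain instances already bound
        used_pairs = set()    # protein pairs already mediated by a DDI
        feasible = True
        for x, (u, v) in zip(choice, candidates):
            if not x:
                continue
            pair = frozenset((u[0], v[0]))
            if u in used_domains or v in used_domains or pair in used_pairs:
                feasible = False
                break
            used_domains.update((u, v))
            used_pairs.add(pair)
        if feasible and sum(choice) > best_size:
            best_size, best_choice = sum(choice), choice
    return [c for x, c in zip(best_choice, candidates) if x]


# Hypothetical candidate: A, B and C each carry one domain, so any two
# interactions would have to share a domain.
single_domain = [
    (("A", "d1"), ("B", "d2")),
    (("A", "d1"), ("C", "d3")),
    (("B", "d2"), ("C", "d3")),
]

# Hypothetical candidate: A carries two domains and can bind B and C at once.
two_domains = [
    (("A", "d1"), ("B", "d2")),
    (("A", "d4"), ("C", "d3")),
]

conflicting = select_ddis(single_domain)
compatible = select_ddis(two_domains)
```

In the first toy candidate only one interaction can be kept, which suggests that the three proteins cannot all bind simultaneously; in the second, both interactions are compatible. For realistic candidate sizes the enumeration would be replaced by a binary integer linear programming solver.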
Our combinatorial approach can provide better accuracy for the prediction of protein complexes and also enables the identification of both the direct PPIs and the DDIs that mediate them in complexes.