High-throughput identification of interacting protein-protein binding sites
1 Department of Chemistry and Biochemistry, University of California, San Diego, Gilman Drive, La Jolla, CA 92093-0743, USA
2 Department of Pharmacology, University of California, San Diego, Gilman Drive, La Jolla, CA 92093-0743, USA
3 San Diego Supercomputer Center, University of California, San Diego, Gilman Drive, La Jolla, CA 92093-0743, USA
BMC Bioinformatics 2007, 8:223 doi:10.1186/1471-2105-8-223
Published: 27 June 2007
With the growth of sequence and structural data, a number of methods have been proposed to locate putative protein binding sites on protein surfaces. The next step is to determine whether these binding sites interact with each other, and methods that can do so are needed.
We have developed a new method that uses a machine learning approach to detect whether protein binding sites, once identified, interact with each other. The method exploits information relating to sequence and structural complementarity across protein interfaces and has been tested on a non-redundant data set consisting of 584 homo-dimers and 198 hetero-dimers extracted from the PDB. Results indicate that 87.4% of the interacting binding sites and 68.6% of the non-interacting binding sites were correctly identified. Furthermore, we built a pipeline that links this method to a modified version of our previously developed method that predicts the location of binding sites.
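The two percentages reported above are per-class rates: the fraction of truly interacting binding-site pairs that are predicted as interacting (sensitivity) and the fraction of truly non-interacting pairs that are predicted as non-interacting (specificity). The following is a minimal sketch of how such rates can be computed from a list of true labels and predictions. The function name `per_class_rates` and the toy labels are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
from typing import Sequence, Tuple


def per_class_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (illustrative): 1 = interacting pair, 0 = non-interacting pair.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # interacting, predicted interacting
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # interacting, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-interacting, predicted non-interacting
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-interacting, predicted interacting

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
sens, spec = per_class_rates(y_true, y_pred)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Reporting the two rates separately, rather than a single overall accuracy, shows how a classifier performs on each class even when interacting and non-interacting examples are present in unequal numbers.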
We have demonstrated that this high-throughput pipeline is capable of identifying binding sites for proteins, their interacting binding sites and, ultimately, their binding partners on a large scale.