Open Access Highly Accessed Methodology article

Binding Site Prediction for Protein-Protein Interactions and Novel Motif Discovery using Re-occurring Polypeptide Sequences

Adam Amos-Binks1, Catalin Patulea3, Sylvain Pitre1, Andrew Schoenrock1, Yuan Gui2, James R Green3, Ashkan Golshani2 and Frank Dehne1*

  • * Corresponding author: Frank Dehne frank@dehne.net

  • † Equal contributors

Author affiliations

1 School of Computer Science, Carleton University, Ottawa, ON K1S5B6, Canada

2 Department of Biology, Carleton University, Ottawa, ON K1S5B6, Canada

3 Department of Systems and Computer Engineering, Carleton University, Ottawa, ON K1S5B6, Canada

Citation and License

BMC Bioinformatics 2011, 12:225  doi:10.1186/1471-2105-12-225

Published: 2 June 2011

Abstract

Background

While there are many methods for predicting protein-protein interactions, very few can determine the specific site of interaction on each protein. Characterizing the specific sequence regions that mediate interaction (binding sites) is crucial for understanding cellular pathways. Experimental methods often report false binding sites due to experimental limitations, while computational methods tend to require data that are not available at the proteome scale. Here we present PIPE-Sites, a novel method for protein-specific binding site prediction based on pairs of re-occurring polypeptide sequences, which have previously been shown to accurately predict protein-protein interactions. PIPE-Sites operates at high specificity and requires only the sequences of the query proteins and a database of known binary interactions, with no binding site data, making it applicable to binding site prediction at the proteome scale.
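
The idea of scoring pairs of re-occurring polypeptide sequences against a database of known interactions can be illustrated with a small sketch. This is not the PIPE-Sites implementation: it assumes exact substring matching and a very short window, and the names `windows`, `cooccurrence_matrix`, `peak`, the window length `w`, and the toy sequences are all hypothetical choices made for illustration only.

```python
def windows(seq, w):
    """Yield (start, fragment) for every length-w window of seq."""
    for i in range(len(seq) - w + 1):
        yield i, seq[i:i + w]


def cooccurrence_matrix(seq_a, seq_b, proteins, interactions, w=3):
    """For each window pair (i, j), count the known interacting pairs (X, Y)
    in which window i of seq_a occurs in one partner and window j of seq_b
    occurs in the other. Exact matching is a simplifying assumption."""
    rows = max(len(seq_a) - w + 1, 0)
    cols = max(len(seq_b) - w + 1, 0)
    matrix = [[0] * cols for _ in range(rows)]
    for i, frag_a in windows(seq_a, w):
        for j, frag_b in windows(seq_b, w):
            for x, y in interactions:
                sx, sy = proteins[x], proteins[y]
                # Interactions are unordered, so test both orientations,
                # but count each known pair at most once.
                if (frag_a in sx and frag_b in sy) or (frag_a in sy and frag_b in sx):
                    matrix[i][j] += 1
    return matrix


def peak(matrix):
    """Return (i, j, score) of the first highest-scoring cell, or None."""
    best = None
    for i, row in enumerate(matrix):
        for j, score in enumerate(row):
            if best is None or score > best[2]:
                best = (i, j, score)
    return best


# Toy database: sequences and known binary interactions (no site data).
toy_proteins = {
    "P1": "MKTAYIAKQR",
    "P2": "GGSWLEDVVV",
    "P3": "AAATAYIPPP",
    "P4": "QQSWLEHHHH",
}
toy_interactions = [("P1", "P2"), ("P3", "P4")]

toy_matrix = cooccurrence_matrix("LLTAYLL", "NNSWLNN", toy_proteins, toy_interactions)
toy_peak = peak(toy_matrix)
print(toy_peak)
```

In this toy case the highest-scoring cell corresponds to the window starting at position 2 (zero-based) in each query sequence, because the fragments `TAY` and `SWL` co-occur in both known interacting pairs. A realistic version would presumably use a similarity-based match rather than exact substrings and would scale the search to a full interaction database.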

Results

PIPE-Sites was evaluated using a dataset of 265 yeast and 423 human interacting protein pairs with experimentally determined binding sites. When applied to the same dataset, PIPE-Sites predictions were closer to the confirmed binding sites than those of two existing binding site prediction methods based on domain-domain interactions. Finally, we applied PIPE-Sites to two datasets of 2,347 yeast and 14,438 human novel protein pairs predicted to interact with high confidence. An analysis of the predicted interaction sites revealed a number of protein subsequences that re-occur frequently in binding sites and may represent novel binding motifs.

Conclusions

PIPE-Sites is an accurate method for predicting protein binding sites and is applicable at the proteome scale. Thus, PIPE-Sites could be useful for exhaustive analysis of protein binding patterns in whole proteomes as well as for the discovery of novel binding motifs. PIPE-Sites is available online at http://pipe-sites.cgmlab.org/.