Identification of specificity determining residues in peptide recognition domains using an information theoretic approach applied to large-scale binding maps
1 Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
2 Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
3 Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
4 Department of Pharmacology, Yale University, New Haven, CT, USA
5 Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
6 Department of Computer Science, Yale University, New Haven, CT, USA
7 Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada
8 Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
9 Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
BMC Biology 2011, 9:53 doi:10.1186/1741-7007-9-53
Published: 11 August 2011
Peptide Recognition Domains (PRDs) are commonly found in signaling proteins. They mediate protein-protein interactions by recognizing and binding short motifs in their ligands. Although a great deal is known about PRDs and their interactions, predicting PRD specificities remains a largely unsolved problem.
We present a novel approach to identifying the Specificity Determining Residues (SDRs) of PRDs, that is, the domain residues that determine binding specificity. Our algorithm generalizes earlier information theoretic approaches to coevolution analysis so that they become applicable to this problem. It leverages the growing wealth of binding data between PRDs and large numbers of random peptides, and searches for PRD residues that exhibit strong evolutionary covariation with positions in the statistical profiles of bound peptides. The calculations use only sequence information and can therefore be applied to PRDs that lack crystal structures. We applied the approach to PDZ, SH3 and kinase domains, and evaluated the results using both residue proximity in co-crystal structures and verified binding specificity maps from mutagenesis studies.
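The covariation idea can be illustrated with a minimal sketch. It is not the scoring function of the paper, which is not specified in this abstract. It assumes one plausible generalization of mutual information, in which each domain contributes its observed residue at an alignment column together with the whole amino acid frequency profile of its bound peptides at one peptide position. The function name `profile_mutual_information` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def profile_mutual_information(residues, profiles):
    """Mutual information (in bits) between a domain alignment column
    and one position of the bound-peptide profiles.

    residues: list of one-letter residues, one per domain.
    profiles: list of dicts mapping peptide amino acid -> frequency
              (each summing to 1), one per domain, in the same order.

    Assumption: the joint distribution is built by spreading each
    domain's unit weight over its peptide profile, rather than by
    counting a single paired residue as in classical coevolution MI.
    """
    if not residues or len(residues) != len(profiles):
        raise ValueError("need one profile per domain residue")

    n = len(residues)
    joint = defaultdict(float)
    for res, prof in zip(residues, profiles):
        for aa, freq in prof.items():
            joint[(res, aa)] += freq / n

    # Marginal distributions over domain residues and peptide residues.
    p_res = defaultdict(float)
    p_aa = defaultdict(float)
    for (res, aa), p in joint.items():
        p_res[res] += p
        p_aa[aa] += p

    mi = 0.0
    for (res, aa), p in joint.items():
        if p > 0:
            mi += p * math.log2(p / (p_res[res] * p_aa[aa]))
    return mi


# Toy data (hypothetical): the domain residue fully determines the
# preferred peptide residue, so the score is high.
covarying = profile_mutual_information(
    ["K", "K", "E", "E"],
    [{"D": 1.0}, {"D": 1.0}, {"R": 1.0}, {"R": 1.0}],
)

# Toy data (hypothetical): every domain binds the same mixed profile,
# so the domain residue carries no information about the peptide.
independent = profile_mutual_information(
    ["K", "K", "E", "E"],
    [{"D": 0.5, "R": 0.5}] * 4,
)
```

In such a scheme, every pair of domain column and peptide position would be scored, and the highest-scoring domain columns would be the candidate SDRs. A practical version would also need to correct for phylogenetic bias and small sample sizes, which this sketch ignores.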
Our predictions correlated strongly with the physical proximity of residues, demonstrating the ability of our approach to detect physical interactions between the binding partners. Some high-scoring pairs were further confirmed, using previous experimental results, to affect binding specificity. Combining the covariation results also allowed us to predict binding profiles with higher reliability than two other methods that do not explicitly take residue covariation into account.
The general applicability of our approach to the three different domain families demonstrated in this paper suggests its potential for predicting binding targets and assisting the exploration of binding mechanisms.