Open Access | Highly Accessed | Methodology article

ResBoost: characterizing and predicting catalytic residues in enzymes

Ron Alterovitz1, Aaron Arvey2, Sriram Sankararaman3, Carolina Dallett4, Yoav Freund2 and Kimmen Sjölander4

1Department of Computer Science, University of North Carolina at Chapel Hill, USA

2Department of Computer Science and Engineering, University of California, San Diego, USA

3Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, USA

4Department of Bioengineering, University of California, Berkeley, USA

BMC Bioinformatics 2009, 10:197 doi:10.1186/1471-2105-10-197

Published: 27 June 2009

Abstract

Background

Identifying the catalytic residues in enzymes can aid in understanding the molecular basis of an enzyme's function and has significant implications for designing new drugs, identifying genetic disorders, and engineering proteins with novel functions. Since experimentally determining catalytic sites is expensive, better computational methods for identifying catalytic residues are needed.

Results

We propose ResBoost, a new computational method to learn characteristics of catalytic residues. The method effectively selects and combines rules of thumb into a simple, easily interpretable logical expression that can be used for prediction. We formally define the rules of thumb that are often used to narrow the list of candidate residues, including residue evolutionary conservation, 3D clustering, solvent accessibility, and hydrophilicity. ResBoost builds on two methods from machine learning, the AdaBoost algorithm and Alternating Decision Trees, and provides precise control over the inherent trade-off between sensitivity and specificity. We evaluated ResBoost using cross-validation on a dataset of 100 enzymes from the hand-curated Catalytic Site Atlas (CSA).
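To make the boosting idea concrete, the following is a minimal sketch of plain AdaBoost over threshold rules ("stumps"), each of which plays the role of one rule of thumb. It is not the ResBoost implementation: the Alternating Decision Tree component is omitted, and the two features (conservation, solvent accessibility), their values, and all function names are hypothetical and chosen only for illustration. The decision threshold on the combined score illustrates how the trade-off between sensitivity and specificity might be controlled.

```python
import math

def train_adaboost(X, y, rounds=3):
    """Plain AdaBoost with threshold stumps.
    X: list of feature tuples; y: labels in {+1, -1}.
    Returns a list of (feature_index, threshold, polarity, alpha)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively search for the stump with the lowest weighted error.
        for j in range(len(X[0])):
            for t in sorted({x[j] for x in X}):
                for pol in (1, -1):
                    preds = [pol if x[j] >= t else -pol for x in X]
                    err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                    if best is None or err < best[0]:
                        best = (err, j, t, pol, preds)
        err, j, t, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, t, pol, alpha))
        # Up-weight misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def score(model, x):
    """Weighted vote of all stumps; larger means 'more likely catalytic'."""
    return sum(a * (pol if x[j] >= t else -pol) for j, t, pol, a in model)

def predict(model, x, threshold=0.0):
    """Lower thresholds favor sensitivity; higher ones favor specificity."""
    return 1 if score(model, x) >= threshold else -1

# Hypothetical toy residues: (conservation, solvent accessibility).
X = [(0.95, 0.10), (0.90, 0.20), (0.85, 0.15), (0.40, 0.60),
     (0.30, 0.80), (0.88, 0.90), (0.20, 0.10)]
y = [1, 1, 1, -1, -1, -1, -1]  # +1 = catalytic, -1 = non-catalytic

model = train_adaboost(X, y, rounds=3)
for thr in (2.0, 0.0, -1.0):
    positives = sum(1 for x in X if predict(model, x, thr) == 1)
    print(f"threshold={thr:+.1f}: {positives} residues predicted catalytic")
```

Sweeping the threshold from high to low moves the classifier from few, confident predictions toward more predictions that include more false positives, which is the kind of control the abstract refers to.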

Conclusion

ResBoost achieved 85% sensitivity at a 9.8% false positive rate and 73% sensitivity at a 5.7% false positive rate. It reduces the number of false positives by up to 56% compared with evolutionary conservation scoring alone. We also illustrate the ability of ResBoost to identify recently validated catalytic residues not listed in the CSA.
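For readers less familiar with these metrics, the sketch below shows how sensitivity, false positive rate, and a relative reduction in false positives are conventionally computed from per-residue labels. The function names and all numbers are hypothetical toy values, not data from the study, and the code assumes that both catalytic and non-catalytic residues are present.

```python
def sensitivity_and_fpr(y_true, y_pred):
    """y_true, y_pred: sequences of 1 (catalytic) or 0 (non-catalytic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    sensitivity = tp / (tp + fn)  # fraction of catalytic residues recovered
    fpr = fp / (fp + tn)          # fraction of non-catalytic residues flagged
    return sensitivity, fpr

def relative_fp_reduction(fp_baseline, fp_method):
    """Fractional drop in false positives relative to a baseline method."""
    return (fp_baseline - fp_method) / fp_baseline

# Toy example: 4 catalytic and 10 non-catalytic residues.
y_true = [1, 1, 1, 1] + [0] * 10
y_pred = [1, 1, 1, 0] + [1] + [0] * 9

sens, fpr = sensitivity_and_fpr(y_true, y_pred)
print(f"sensitivity={sens:.2f}, false positive rate={fpr:.2f}")
print(f"relative FP reduction={relative_fp_reduction(25, 11):.2f}")
```

Because catalytic residues are a small minority of all residues in an enzyme, even a modest false positive rate can correspond to many false positives in absolute terms, which is why the reduction relative to a baseline is reported alongside the rates.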


© 1999-2009 BioMed Central Ltd unless otherwise stated. Part of Springer Science+Business Media.