This article is part of the supplement: Selected proceedings from the Automated Function Prediction Meeting 2011

Open Access Proceedings

Function prediction from networks of local evolutionary similarity in protein structure

Serkan Erdin1,2, Eric Venner1, Andreas Martin Lisewski1,2 and Olivier Lichtarge1,2*

Author Affiliations

1 Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA

2 Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA

BMC Bioinformatics 2013, 14(Suppl 3):S6 doi:10.1186/1471-2105-14-S3-S6

Published: 28 February 2013

Abstract

Background

Annotating protein function with both high accuracy and high sensitivity remains a major challenge in structural genomics. One proven computational strategy is to group a few key functional amino acids into templates and search for these templates in other protein structures, transferring function when a match is found. To this end, we previously developed Evolutionary Trace Annotation (ETA) and showed that diffusing known annotations over a network of template matches on a structural genomics scale improved predictions of function. To further increase sensitivity, we now let each protein contribute multiple templates rather than just one, and we also let the template size vary.
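The idea of diffusing known annotations over a network of template matches can be illustrated with a minimal sketch. The code below is not the ETA implementation: it assumes a generic iterative label-propagation scheme on a weighted, undirected graph, and the protein identifiers, function labels, match weights, and the parameters `alpha` and `iterations` are all hypothetical values chosen for illustration.

```python
from math import sqrt


def diffuse_labels(edges, known, functions, alpha=0.5, iterations=50):
    """Spread known function labels over a weighted, undirected match network.

    edges:     dict mapping (protein_a, protein_b) -> positive match weight
    known:     dict mapping protein -> known function label
    functions: list of candidate function labels
    Returns a dict: protein -> {function: diffusion score}.

    Assumption: a generic label-propagation update, not the authors' method.
    """
    # Build a symmetric adjacency map from the edge list.
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight
    for protein in known:
        neighbors.setdefault(protein, {})
    degree = {p: sum(nbrs.values()) for p, nbrs in neighbors.items()}

    # Annotated proteins start with score 1 for their own function, else 0.
    initial = {
        p: {f: 1.0 if known.get(p) == f else 0.0 for f in functions}
        for p in neighbors
    }
    scores = {p: dict(row) for p, row in initial.items()}

    for _ in range(iterations):
        updated = {}
        for p, nbrs in neighbors.items():
            row = {}
            for f in functions:
                # Degree-normalized contribution from each matched neighbor.
                spread = sum(
                    w / sqrt(degree[p] * degree[q]) * scores[q][f]
                    for q, w in nbrs.items()
                )
                # Blend neighbor evidence with the protein's own annotation.
                row[f] = alpha * spread + (1 - alpha) * initial[p][f]
            updated[p] = row
        scores = updated
    return scores


def predict(scores, protein):
    """Return the top-scoring function, or None if no label reached the protein."""
    row = scores[protein]
    best = max(row, key=row.get)
    return best if row[best] > 0 else None


# Hypothetical toy network: Q has no direct match to an annotated protein,
# but is connected to P1 through P2.
edges = {("P1", "P2"): 1.0, ("P2", "Q"): 1.0, ("P3", "P4"): 1.0}
known = {"P1": "hydrolase", "P3": "oxidoreductase"}
scores = diffuse_labels(edges, known, ["hydrolase", "oxidoreductase"])
print(predict(scores, "Q"))
```

In this toy network, Q receives a label only because the annotation travels through an intermediate match, which is the kind of case where a network approach can add sensitivity over direct matches alone.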

Results

Retrospective benchmarks on 605 Structural Genomics enzymes showed that multiple templates, when combined with single-template predictions, increased sensitivity by up to 14% while maintaining accuracy above 91%. Diffusing function globally on networks of single- and multiple-template matches marginally increased the area under the ROC curve over 0.97, but in a subset of proteins that could not be annotated by ETA, the network approach recovered annotations for the most confident 20-23 of 91 cases with 100% accuracy.

Conclusions

We improve the accuracy and sensitivity of predictions by using multiple templates per protein structure when constructing networks of ETA matches and diffusing annotations.