Residue contact-count potentials are as effective as residue-residue contact-type potentials for ranking protein decoys
The Max Planck Institute for Molecular Genetics, Berlin, Germany
BMC Structural Biology 2008, 8:53 doi:10.1186/1472-6807-8-53Published: 8 December 2008
For over 30 years potentials of mean force have been used to evaluate the relative energy of protein structures. The most commonly used potentials define the energy of residue-residue interactions and are derived from the empirical analysis of the known protein structures. However, single-body residue 'environment' potentials, although widely used in protein structure analysis, have not been rigorously compared to these classical two-body residue-residue interaction potentials. Here we do not try to combine the two different types of residue interaction potential, but rather to assess their independent contribution to scoring protein structures.
A data set of nearly three thousand monomers was used to compare pairwise residue-residue 'contact-type' propensities to single-body residue 'contact-count' propensities. Using a large and standard set of protein decoys we performed an in-depth comparison of these two types of residue interaction propensities. The scores derived from the contact-type and contact-count propensities were assessed using two different performance metrics and were compared using 90 different definitions of residue-residue contact. Our findings show that both types of score perform equally well on the task of discriminating between near-native protein decoys. However, in a statistical sense, the contact-count based scores were found to carry more information than the contact-type based scores.
Our analysis has shown that the performance of either type of score is very similar on a range of different decoys. This similarity suggests a common underlying biophysical principle for both types of residue interaction propensity. However, several features of the contact-count based propensity suggests that it should be used in preference to the contact-type based propensity. Specifically, it has been shown that contact-counts can be predicted from sequence information alone. In addition, the use of a single-body term allows for efficient alignment strategies using dynamic programming, which is useful for fold recognition, for example. These facts, combined with the relative simplicity of the contact-count propensity, suggests that contact-counts should be studied in more detail in the future.