Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA

Department of Molecular and Integrative Physiology, Department of Biochemistry, UIUC Programs in Biophysics, Neuroscience, and Bioengineering, National Center for Supercomputing Applications, and Beckman Institute, University of Illinois, Urbana, IL 61801, USA

Abstract

Background

Accurate protein loop structure models are important for understanding the functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction.

Results

We have developed a Pareto Optimal Consensus (POC) method, a consensus model ranking approach that integrates multiple knowledge- or physics-based scoring functions. The procedure for identifying the best-quality models in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on their fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops of 4 to 12 residues in length, using a functional space composed of several carefully selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which typically make up ~20% or less of the decoys in a set, cover the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding the best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% fewer false positives in distinguishing the native conformation, identifying a near-native model (RMSD < 0.5 Å from the native) as top-ranked, and selecting at least one near-native model among the top-5-ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms other popular consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods.

Conclusions

By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set.

Background

Protein loop structure modeling is important in structural biology for its wide applications, including determining the surface loop regions in homology modeling

In many loop modeling methods

In practice, the value of computer-generated protein loop models in biological research relies critically on their accuracy. While efficiently sampling the protein loop conformation space to produce a sufficient number of low-energy models that cover good conformations remains challenging, another critical problem is the insensitivity of the existing protein scoring functions, which are developed to estimate the energy of the protein molecule. This insensitivity makes it difficult to distinguish the native or native-like conformations from erroneous models, and thus restricts loop structure prediction accuracy. Therefore, selecting the highest-quality loop models from among many candidates is a critical step in solving the protein loop structure prediction problem.

The scoring functions play a significant role in protein structure assessment and selection. Although a number of scoring functions are currently available for protein loop model evaluation, there is no generally reliable one that can always distinguish the native or near native models. Every existing scoring function has its own pros and cons. Recently, the strategy of using multiple scoring functions to estimate the quality of models and improve selection was proposed in protein folding and protein-ligand docking

Similar to structure prediction in an overall protein, the scoring functions that have been used in loop modeling can be categorized into knowledge-based

There are problems in the theoretical justification of both physics- and knowledge-based scoring functions for protein structure modeling. Ideally, a physics-based scoring function would be evaluated with quantum mechanics, in which case the score could reflect the true energy. In computational practice, quantum mechanical calculations are intractable because of the size of protein molecules. As a compromise, physics-based scoring functions (force fields) are developed mainly on the basis of classical physics to approximate the true energy of a protein molecule. Knowledge-based functions, on the other hand, derive their rules from existing experimental structure data, typically by applying the inverse Boltzmann law. However, because the known structures are an extremely small fraction compared to the unknown structures, the data used to develop knowledge-based functions are potentially undersampled

**RMSD-Score Plot of (70:78) Decoy Set in Various Scoring Functions**

In this paper, we present a Pareto Optimality Consensus (POC) method based on the Pareto optimality

Methods

The Consensus Strategy

Although each scoring function may suffer from some insensitivity and inaccuracy, combining multiple, carefully selected scoring functions may compensate for the deficiencies of the individual scoring functions. For example, as shown in Figure

**Multiple Scoring Functions Coordinate Plot of Decoys in (70:78) Decoy Set**

The Pareto Optimality Consensus Method

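The rationale of the POC method is to rank a model according to its Pareto-dominance relationship to the other models in the model set. The first step of the POC method is to identify the Pareto-optimal models. The definition of Pareto-optimality is based on the dominance relationship between models, where a lower score is taken as better: a model $u$ is said to dominate another model $v$ if both of the following conditions hold:

i) for each scoring function $f_i(\cdot)$, $f_i(u) \le f_i(v)$;

ii) there is at least one scoring function $f_j(\cdot)$ where $f_j(u) < f_j(v)$.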

By definition, the models which are not dominated by any other models in the model set form the Pareto-optimal solution set. A Pareto-optimal model possesses certain optimality compared to the other ones in the model set.
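The identification of the Pareto-optimal set can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that lower scores are better, and the function names (`dominates`, `pareto_front`) and the decoy scores are hypothetical.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if score vector u Pareto-dominates v (lower is better)."""
    no_worse = all(a <= b for a, b in zip(u, v))
    strictly_better = any(a < b for a, b in zip(u, v))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of the models that no other model in the set dominates."""
    return [
        i
        for i, u in enumerate(scores)
        if not any(dominates(v, u) for j, v in enumerate(scores) if j != i)
    ]


# Hypothetical decoys scored by two functions (illustrative values only).
decoys = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0), (4.0, 4.0)]
front = pareto_front(decoys)
print(front)  # [0, 1, 2]
```

The pairwise comparison is quadratic in the number of models, which is consistent with the scaling discussed under the limitations of the method.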

Once the Pareto-optimal models are identified, the next step in the POC method is to rank these models, so that the model that exhibits most optimality over other models in the model set will have the best rank. A simple solution, which is used in several evolutionary algorithms for multi-objective optimization

**Example of Fuzzy Pareto Dominance**

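To more accurately measure the dominance relationship, we adopt a fuzzy scheme applied to the normalized scoring functions $f_i(\cdot)$.

Finally, the fuzzy Pareto dominance relation between two models $u$ and $v$ is quantified by two degrees. Model $u$ dominates model $v$ by degree $\mu_a$, where

$$\mu_a(u, v) = \frac{\prod_i \min\left(f_i(u), f_i(v)\right)}{\prod_i f_i(u)}$$

for all normalized scoring functions $f_i(\cdot)$, and model $u$ is dominated by model $v$ by degree $\mu_p$, where

$$\mu_p(u, v) = \frac{\prod_i \min\left(f_i(u), f_i(v)\right)}{\prod_i f_i(v)}$$

for all normalized scoring functions $f_i(\cdot)$.

For the example shown in the figure (Example of Fuzzy Pareto Dominance), $\mu_a(A, C) = 1.0$, $\mu_p(A, C) = 0.083$, $\mu_a(A, B) = 1.0$, and $\mu_p(A, B) = 0.167$. As a result, A shows a more significant dominance over C than over B in the fuzzy dominance scheme.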
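The quoted degrees can be reproduced with a short sketch of a product-form fuzzy dominance measure. Several things here are assumptions: the product-of-minima form itself, the requirement that normalized scores be strictly positive with lower values being better, and the coordinates of A, B, and C, which are hypothetical and chosen only so that the resulting degrees match the values given in the text.

```python
from math import prod
from typing import Sequence


def mu_a(u: Sequence[float], v: Sequence[float]) -> float:
    """Degree to which u dominates v; equals 1.0 when u is no worse anywhere."""
    return prod(min(a, b) for a, b in zip(u, v)) / prod(u)


def mu_p(u: Sequence[float], v: Sequence[float]) -> float:
    """Degree to which u is dominated by v; small when u is much better."""
    return prod(min(a, b) for a, b in zip(u, v)) / prod(v)


# Hypothetical positive, normalized scores for three models (assumed values).
A, B, C = (1.0, 1.0), (2.0, 3.0), (3.0, 4.0)

print(round(mu_a(A, C), 3), round(mu_p(A, C), 3))  # 1.0 0.083
print(round(mu_a(A, B), 3), round(mu_p(A, B), 3))  # 1.0 0.167
```

Unlike a plain dominated/not-dominated test, the two degrees reflect how far apart the models are: A is further ahead of C than of B, and the smaller value of `mu_p(A, C)` records that.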

The ranking value for model _{i}, _{i}), is computed as

which will be used to rank the Pareto-optimal models. To rank the whole model set, we first identify the Pareto-optimal models and rank them according to their fuzzy Pareto dominance relationship. We then remove these models, identify the Pareto-optimal models among the remaining ones, and assign ranks to them. The procedure is repeated until no models are left in the model set.
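The repeated identify-and-remove procedure can be sketched as follows. This is a minimal illustration under stated assumptions: lower scores are better, the helper names (`dominates`, `peel_fronts`) and the decoy scores are hypothetical, and the ordering of models within each front by their fuzzy dominance ranking value is deliberately left out.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if score vector u Pareto-dominates v (lower is better)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def peel_fronts(scores: List[Sequence[float]]) -> List[List[int]]:
    """Split model indices into successive Pareto fronts, best front first."""
    remaining = list(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        # Models that nothing still in the pool dominates form the next front.
        front = [
            i
            for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


# Hypothetical decoys scored by two functions (illustrative values only).
decoys = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0), (4.0, 4.0)]
fronts = peel_fronts(decoys)
print(fronts)  # [[0, 1, 2], [3], [4]]
```

Every model in an earlier front is ranked ahead of every model in a later one; the loop always terminates because a non-empty finite set contains at least one non-dominated model.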

Results

Effectiveness of the Pareto Optimal Models

Because selection and ranking in the POC method are based on Pareto optimality, the quality of the Pareto-optimal models is critical. The Pareto-optimal models include not only the optima of individual scoring functions, but also the non-dominated models that are optimal for some (linear or non-linear) combination of the scoring functions. In our computational experiment, five scoring functions, Rosetta, DDFIRE, DOPE, triplet backbone dihedral, and OPLS-AA/SGB, are selected to form the function space. Figure

**Average number of decoys and average number of Pareto-optimal decoys for loop targets of 4 to 12 residues in Jacobson's decoy sets**. Only a small fraction (3-22%) of the decoys of a loop target are Pareto-optimal.

**Number of the targets whose Pareto-optimal decoys contain at least one decoy within certain RMSD cutoff from the best decoy**. The Pareto-optimal decoys can effectively cover the best decoy or one close to the best decoy in a target's decoy set.

**Effectiveness of the Pareto-optimal decoys**. The best decoy with minimum RMSD, or one very close to the best decoy (< 0.1 Å), is among the Pareto-optimal decoys in 9-residue loop targets

Efficiency in Identifying Near-Native Structures

We applied the POC method to the decoy sets generated by Jacobson et al. The decoy set for each target contains very good models (MODEL 1 and MODEL 2) derived from the native structure by optimizing the OPLS-AA/SGB force field as well as other models generated by hierarchical comparative modeling

We consider a decoy with RMSD less than 0.5 Å from the native to be near-native; a false positive is then a non-near-native decoy that receives a high rank. Figure

**Number of False Positives**. Number of cases where the top-ranked decoy is a false positive and the near-native structures are missed in the top-5-ranked decoys in 502 loop targets in the POC method and individual scoring functions
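The two counts reported in this figure can be computed as sketched below. The 0.5 Å cutoff is taken from the text; the target names, the RMSD values, and the helper functions are hypothetical and serve only to illustrate the bookkeeping.

```python
from typing import Dict, List

NEAR_NATIVE_CUTOFF = 0.5  # RMSD cutoff in angstroms, as used in the text


def top1_is_false_positive(ranked_rmsds: List[float]) -> bool:
    """True if the top-ranked decoy is not near-native."""
    return ranked_rmsds[0] >= NEAR_NATIVE_CUTOFF


def top5_misses_near_native(ranked_rmsds: List[float]) -> bool:
    """True if none of the five top-ranked decoys is near-native."""
    return all(r >= NEAR_NATIVE_CUTOFF for r in ranked_rmsds[:5])


# Hypothetical targets: decoy RMSDs listed in the order a method ranked them.
targets: Dict[str, List[float]] = {
    "target_a": [0.3, 0.8, 1.2, 0.4, 2.0, 0.2],
    "target_b": [0.9, 0.4, 1.5, 1.1, 2.2, 0.3],
    "target_c": [1.4, 0.7, 0.9, 1.8, 2.5, 0.1],
}

top1_fp = sum(top1_is_false_positive(r) for r in targets.values())
top5_miss = sum(top5_misses_near_native(r) for r in targets.values())
print(top1_fp, top5_miss)  # 2 1
```

In this toy example `target_c` contains a near-native decoy (0.1 Å) that is ranked sixth, so it counts as a miss under both criteria.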

We use the receiver operating characteristic (ROC) curves to evaluate the ranking performance of each individual scoring function as well as the POC method for each loop target, according to the method described in

**ROC Curves for Decoys in (244:252) and 153l(98:109)**. In these ROC curves, the true positives are the number of top-^{th }best RMSD in a decoy set and

Average ROC-AUC Comparison in Jacobson's Decoy Sets and the Membrane Protein Loop Decoy (MPD) Sets

| Decoy set | **POC** | **Rosetta** | **DFIRE** | **DOPE** | **Triplet** | **OPLSAA** |
|---|---|---|---|---|---|---|
| Jacobson | 0.780920 | 0.752171 | 0.741472 | 0.737116 | 0.747701 | 0.608012 |
| MPD | 0.640534 | 0.592584 | 0.635511 | 0.612899 | 0.606396 | N/A |

Figure

**Average RMSD of the best models selected from 5-top-ranked decoys in Jacobson's loop sets ranging from 4 to 12 residues**.

Figure

**RMSD of the best-ranked decoy in 11-residue loop targets of Jacobson's decoy sets**

Respectively, Figures

**Comparison of false positive rates in POC and individual scoring functions using different RMSD cutoffs in membrane protein loop decoy sets**

**Percentage of targets in the membrane protein loop decoy sets where the top-ranked decoy is within 1.0 Å from the native**

We also applied the POC method with the native structure mixed in the decoy sets generated by Jacobson et al

**Efficiency of POC in Identifying the Native Structures**. The supplementary file describes the efficiency of the POC method when the native structure is mixed into the decoy sets generated by Jacobson et al. The POC method also leads to fewer false positives than the individual scoring functions.

Discussion

Comparison to Regression-based Consensus Method

A popular approach to taking advantage of multiple scoring functions is to build a consensus scoring function by combining the individual scores using linear regression. Consider the case of two scoring functions, $f_1$ and $f_2$. When a set of weights is determined by regression, a contour line is formed, and the minimum of the consensus function corresponds to a model on the Pareto optimal front, namely the tangent point between the contour line and the model solution space. However, no contour line can produce a tangent point with the feasible solution space in the region BC of the Pareto optimal front. This is because, before a tangent point is reached in BC, the contour line becomes tangent at another point in the AB or CD zone, which yields a lower overall consensus function value. In other words, models in the concave region BC will never be selected by a consensus scoring function, although these models are Pareto-optimal relative to the others in the model set. Some regions of the Pareto optimal front may remain unreachable even if nonlinear regression is used to combine the terms.

**Deficiency of Regression-based Consensus Method**. Pareto-optimal Models at the Concave Pareto Optimal Front Are Unreachable in a Regression-based Consensus Scoring Function
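The unreachable concave region can be demonstrated numerically with a small sketch. The three models and their scores are hypothetical: all three are Pareto-optimal, but the one labelled `middle` sits in a concave part of the front. The sweep over weights stands in for any linear consensus function with non-negative weights; it is not the regression procedure used in the comparison.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical (f1, f2) scores; lower is better. "middle" is Pareto-optimal
# but lies above the line joining "left" and "right" (a concave region).
models: Dict[str, Tuple[float, float]] = {
    "left": (0.0, 1.0),
    "middle": (0.6, 0.7),
    "right": (1.0, 0.0),
}


def dominates(u: Tuple[float, float], v: Tuple[float, float]) -> bool:
    """Return True if u Pareto-dominates v (lower is better)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def weighted_sum_winners(scores: Dict[str, Tuple[float, float]], steps: int = 100) -> Set[str]:
    """Models that minimize w*f1 + (1-w)*f2 for some w on a grid over [0, 1]."""
    winners: Set[str] = set()
    for k in range(steps + 1):
        w = k / steps
        best = min(scores, key=lambda n: w * scores[n][0] + (1.0 - w) * scores[n][1])
        winners.add(best)
    return winners


non_dominated: List[str] = [
    n for n, u in models.items()
    if not any(dominates(v, u) for m, v in models.items() if m != n)
]
winners = weighted_sum_winners(models)
print(non_dominated)    # ['left', 'middle', 'right']
print(sorted(winners))  # ['left', 'right']
```

No choice of weight ever selects `middle`, whereas a Pareto-based selection keeps it as a candidate.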

Figure

**Selection performance comparison between POC and SVR in identifying the top-5 decoys in Jacobson's decoy sets**

Another major drawback of the regression-based consensus method is its dependence on the size, composition and generality of the training set used to derive the weights. Similar to the vote-based or rank-based consensus methods, POC does not require a training procedure. The selection and ranking solely depend on evaluation of the dominance relationship among the decoys.

Comparison to Rank-by-Number, Rank-by-Rank, and Rank-by-Vote Methods

The vote-based consensus method is another strategy for selection with multiple scoring functions. It takes advantage of the observation that models voted for by more scoring functions tend to be more accurate than those receiving fewer votes. However, the disadvantage of vote-based consensus methods is that they are very sensitive to the artificially set vote threshold value

Table

Selection Accuracy Comparison of Various Consensus Strategies and Best Individual Scoring Function in Jacobson's Decoy Sets of 502 Loop Targets

| Criterion (number of targets) | **POC** | **Rank-by-Number** | **Rank-by-Rank** | **Rank-by-Vote** | **Best Individual Scoring Function** |
|---|---|---|---|---|---|
| Top-ranked decoy < 0.5 Å | 409 | 397 | 399 | 379 | 357 |
| Best top-5-ranked decoy < 0.5 Å | 470 | 444 | 445 | 412 | 413 |
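For orientation, the three rank-based strategies compared in this table can be sketched as follows. The definitions follow their common usage in consensus scoring and are assumptions here, not a reproduction of the exact settings used in the comparison: rank-by-number averages min-max normalized scores (the normalization is an assumption), rank-by-rank averages per-function ranks, and rank-by-vote counts how many functions place a model within their top `top_k`. All scores and names are hypothetical, and lower scores are taken as better.

```python
from typing import List


def ranks(values: List[float]) -> List[int]:
    """Rank of each model under one scoring function (1 = lowest score)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rank_by_number(table: List[List[float]]) -> List[float]:
    """Mean min-max normalized score per model (lower is better)."""
    normalized = []
    for column in table:
        low, high = min(column), max(column)
        span = (high - low) or 1.0  # avoid division by zero for flat columns
        normalized.append([(x - low) / span for x in column])
    return [sum(col[m] for col in normalized) / len(normalized) for m in range(len(table[0]))]


def rank_by_rank(table: List[List[float]]) -> List[float]:
    """Mean rank per model across scoring functions (lower is better)."""
    per_function = [ranks(column) for column in table]
    return [sum(r[m] for r in per_function) / len(per_function) for m in range(len(table[0]))]


def rank_by_vote(table: List[List[float]], top_k: int = 2) -> List[int]:
    """Votes per model: functions ranking it within their top_k (higher is better)."""
    votes = [0] * len(table[0])
    for column in table:
        for m, r in enumerate(ranks(column)):
            if r <= top_k:
                votes[m] += 1
    return votes


# Hypothetical scores: three scoring functions (rows) by five models (columns).
table = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 3.0, 5.0, 4.0],
    [1.5, 2.5, 0.5, 4.5, 3.5],
]
votes = rank_by_vote(table)
mean_ranks = rank_by_rank(table)
mean_scores = rank_by_number(table)
print(votes)  # [3, 2, 1, 0, 0]
```

The dependence of rank-by-vote on `top_k` is the threshold sensitivity mentioned above: changing it changes which models receive votes at all.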

Comparison to Another Selection Method

Lin and Head-Gordon recently presented a new physics-based energy function with an implicit solvent model, so-called HPMF

Selection Accuracy of the POC method compared to the HPMF Method

| **Loop Length** | **HPMF** | **POC** |
|---|---|---|
| 4 | 0.31 Å | 0.27 Å |
| 6 | 0.61 Å | 0.34 Å |
| 8 | 0.70 Å | 0.53 Å |
| 10 | 0.77 Å | 0.49 Å |
| 11 | 0.67 Å | 0.39 Å |
| 12 | 0.39 Å | 0.32 Å |

Result Analysis

In this section, we analyze, from the biological perspective, the results obtained for several loop targets. These targets include

For the test case of

**The optimal decoy selected by our triplet potential for loop (28:38)**. The decoy makes internal hydrogen bonds (black dashed lines) but few contacts with the protein frame.

On the other hand, Rosetta's best-scored decoy has the opposite problem: it makes some good contacts with the protein frame but has a poor combination of backbone torsion angles. For example, the Thr37 residue has the backbone torsion angles phi = 80°, psi = -45°, which fall in a region of the threonine Ramachandran map that is disallowed because of local steric clashes. The POC method succeeds in this case because it can rely on the other scoring functions, which perform well for this target.

A somewhat opposite example is provided by the

The only case, among all 502 loop targets studied here, where POC fails to capture a native-like structure (within the 0.5 Å cutoff) on the Pareto optimal front is the

**Analysis of the native loop (31:38)**. The hydrophobic residues Phe36-Leu37 (enclosed by the surface) are buried in a stable protein hydrophobic core, being surrounded by many carbon atoms.

The best decoy selected by POC for this loop shows many favorable contacts, including the hydrophobic interaction between Phe36 and Leu37 side-chains. But they are not buried in a protein hydrophobic core in this case. Also, this decoy's surrounding surface, shown in Figure

**The best decoy selected by POC for target (31:38)**. The decoy forms an unfavorable internal cavity that is not occupied by other protein atoms.

Limitations of the POC Method

As with other consensus methods, one limitation of the POC method is its dependence on the accuracy of the scoring functions involved in the consensus scheme. If the large majority of the scoring functions have poor accuracy, the consensus scheme is unlikely to select high-resolution decoys. The effectiveness of the POC method also depends on the quality of the decoys generated: POC is a selection and ranking scheme, and thus it cannot produce a decoy better than the best one in a decoy set.

Another minor disadvantage of the POC method is the decoy selection and ranking time when the decoy set is large. For a set of $N$ decoys, the ranking time scales as $O(N^2)$ because of the requirement of evaluating pair-wise decoy dominance relationships, whereas the ranking time scaling in regression-based, rank-based, or vote-based consensus methods is

Conclusions

The POC method is shown to be effective in distinguishing the best models from the other ones within Jacobson's loop decoy sets and the membrane protein loop decoy sets. It is clear that a combination of multiple, carefully-selected physics- and knowledge-based scoring functions can significantly reduce the number of false positives compared to using an individual scoring function only. Moreover, identifying the decoys at the Pareto optimal front and ranking these decoys based on the fuzzy dominance relationship against the other decoys in the set have led to higher model selection accuracy in the POC method than in the other consensus strategies including rank-by-vote, rank-by-number, rank-by-rank, and regression-based methods. In addition to protein loop structure prediction, the POC approach may also be used in applications of protein folding, protein-protein interaction, and protein-ligand docking.

Our current POC implementation is not biased toward any individual scoring function. However, there may still be room for improvement. For example, POC may be coupled with a training algorithm that measures the efficiency of each scoring function, so that a bias toward certain scoring functions can be incorporated when evaluating the fuzzy Pareto dominance relation. This will be one of our future research directions.

Authors' contributions

YL conceived and implemented the method and carried out the computation. IR performed the biological analysis. SC designed the computational experiment using physics-based energy function. EJ coordinated the study. YL, IR, SC, and EJ performed the result analysis. All authors read and approved the final manuscript.

Acknowledgements

We acknowledge support from NIH grants 5PN2EY016570-06 and 5R01NS063405-02 and from NSF grants 0835718, 0829382, and 0845702.