<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1472-6807-9-76</ui>
   <ji>1472-6807</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Discriminating the native structure from decoys using scoring functions based on the residue packing in globular proteins</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Bahadur</snm>
               <mnm>Prasad</mnm>
               <fnm>Ranjit</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>ranjitp_bahadur@yahoo.com</email>
            </au>
            <au ca="yes" id="A2">
               <snm>Chakrabarti</snm>
               <fnm>Pinak</fnm>
               <insr iid="I1"/>
               <email>pinak@boseinst.ernet.in</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biochemistry, Bose Institute, P-1/12 CIT Scheme VIIM, Calcutta 700 054, India</p>
            </ins>
            <ins id="I2">
               <p>Current address: Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, West Bengal, India</p>
            </ins>
         </insg>
         <source>BMC Structural Biology</source>
         <issn>1472-6807</issn>
         <pubdate>2009</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>76</fpage>
         <url>http://www.biomedcentral.com/1472-6807/9/76</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="doi">10.1186/1472-6807-9-76</pubid>
               <pubid idtype="pmpid">20038291</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>11</day>
               <month>7</month>
               <year>2009</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>28</day>
               <month>12</month>
               <year>2009</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>28</day>
               <month>12</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Bahadur and Chakrabarti; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Setting the rules for the identification of a stable conformation of a protein is of utmost importance for the efficient generation of structures in computer simulation. For structure prediction, a considerable number of possible models are generated from which the best model has to be selected.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Two scoring functions, R<sub>s</sub> and R<sub>p</sub>, are presented that are based on the packing of residues and indicate whether the conformation of an amino acid sequence is native-like. They are defined using the solvent accessible surface area (ASA) and the partner number (PN; the number of other residues within 4.5 &#197;) of a particular residue. The two functions evaluate the deviation from the average packing properties (ASA or PN) of all residues in a polypeptide chain corresponding to a model of its three-dimensional structure. While simple in concept and computationally undemanding, both functions are at least as efficient as other energy functions in discriminating the native structure from decoys in a large number of standard decoy sets, as well as on models submitted for the targets of CASP7. R<sub>s</sub> appears to be slightly more effective than R<sub>p</sub>, as judged by the number of times the native structure has the minimum value of the function and by its separation from the average value for the decoys.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Two parameters, R<sub>s</sub> and R<sub>p</sub>, are discussed that can very efficiently recognize the native fold for a sequence from an ensemble of decoy structures. Unlike many other algorithms that rely on composite scoring functions, these are based on a single parameter, viz., the accessible surface area (or the number of residues in contact), but are still able to capture the essential attribute of the native fold.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Predicting the native structure of proteins from their amino acid sequences has remained an elusive goal. In general, this entails the development of effective methods for conformation sampling and the design of an accurate function for structure discrimination <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. The functions can be based on elaborate calculations and analyses of forces between atoms <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, or be knowledge-based, extracting relevant parameters from a database of experimentally determined protein structures <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. One important area of application of knowledge-based potential functions has been "protein threading" for the prediction of protein tertiary structure in the absence of detectable sequence homology. The technique involves threading a protein sequence onto the frameworks of known protein folds and finding the most energetically favorable conformation <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. In addition to fold recognition applications, where the best conformation of a protein is selected from a database of known protein conformations, knowledge-based scoring functions are also used in protein folding simulations <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. Many statistical scoring functions assume that the frequencies of non-bonded pairs of amino acids follow a Boltzmann-like distribution and that the minimum value of the score occurs in the vicinity of the lowest energy structure. Additionally, a set of probability distributions can be used to construct a scoring function that identifies the maximum probability structure.</p>
         <p>For testing empirical energy functions, challenging and diverse datasets of decoy structures with native-like properties have been generated <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Models submitted to the community-wide experiment CASP (Critical Assessment of techniques for protein Structure Prediction) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> make up diverse sets of structures resulting from various computational approaches <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The most native-like structure needs to be identified from among these models <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. An effective potential should be able to distinguish the native structure from decoy structures with a high degree of accuracy. Energy functions based on residue contact or compactness alone do not have enough discriminating power <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, or can rank the native structure highly only when the competing conformations are more random-coil-like <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. However, here we present two knowledge-based scoring functions, derived from the analysis of residue packing in protein structures, that are quite robust in discriminating the native conformation from a number of misfolded conformations for a given primary protein sequence. The functions were also tested on approximately 19,000 models from server predictions for 71 targets of CASP7 <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. As a descriptor of residue packing we use the average value of either the accessible surface area or the number of other residues in contact with a given residue, calculated from a database of globular proteins. Each function then evaluates the cumulative deviation of the parameter for individual residues from the corresponding average value over the whole polypeptide chain. The experimental structure is found to have the minimum deviation, and thus the minimum value of the function, when applied to a set of decoys from which the native structure has to be identified. The success of the functions indicates that the burial of each residue and its contacts with the surrounding residues are optimized during folding, and that the average values of these parameters can be used as constraints to simulate the folding process. Additionally, a surface patch whose residues have a large overall deviation of these parameters from the average values may be indicative of a binding region on a protein structure, an issue that will be addressed in future work to provide a common perspective on the folding and binding processes.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>Scoring functions have been used to validate X-ray crystal structures, to assess and rank three-dimensional models generated for a protein sequence, to predict the effect of mutations, etc. Here, we are concerned with the identification of the native structure from decoys. The idea for the discriminatory function originated from the formula for the R-factor in crystallography <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. An exact equivalent of that formula would have meant using expression (1) instead of Eq. (3), which is given in Methods.</p>
         <p>
            <display-formula id="M1">
               <graphic file="1472-6807-9-76-i1.gif"/>
            </display-formula>
         </p>
         <p>Each term in Eq. (3) is the absolute difference between the observed and the average values of ASA for a given residue, normalized by the average value. These terms are summed over the whole sequence. In Eq. (1), by contrast, the numerator and the denominator are summed separately. Some other modified formulae, including ones that use the standard deviation of the average values &lt;ASA<sub>x</sub>> in the denominator, were also tried, but Eqs. (2) and (3) were found to be the most efficient at identifying the native structure from a set of decoys. Depending on the structural context, larger residues may show considerable variation in their ASA values in protein structures (as indicated by their larger standard deviations, Table <tblr tid="T1">1</tblr>); normalizing the difference in the numerator of Eq. (3) has the effect of damping the contribution of such residues to the summation.</p>
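         <p>The contrast between the two forms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the five-residue fragment are hypothetical, the observed ASA values are made up, and the form used for Eq. (1) assumes that its denominator sums the average values, which the text above does not state explicitly. The average ASA values are copied from Table 1.</p>

```python
# Mean ASA values for a few residue types, copied from Table 1.
MEAN_ASA = {"Gly": 26.6, "Ala": 28.1, "Leu": 28.8, "Lys": 95.8, "Glu": 73.4}


def per_residue_score(residues, observed, mean_by_type):
    """Sum of |observed - mean| / mean over residues (form described for Eq. 3)."""
    total = 0.0
    for name, value in zip(residues, observed):
        mean = mean_by_type[name]
        total += abs(value - mean) / mean
    return total


def r_factor_like_score(residues, observed, mean_by_type):
    """Summed |observed - mean| divided by summed mean (assumed form of Eq. 1)."""
    numerator = sum(abs(v - mean_by_type[n]) for n, v in zip(residues, observed))
    denominator = sum(mean_by_type[n] for n in residues)
    return numerator / denominator


# Hypothetical five-residue fragment with made-up observed ASA values.
residues = ["Gly", "Ala", "Leu", "Lys", "Glu"]
observed_asa = [26.6, 56.2, 0.0, 95.8, 36.7]

rs = per_residue_score(residues, observed_asa, MEAN_ASA)
r1 = r_factor_like_score(residues, observed_asa, MEAN_ASA)
```

         <p>In this toy case the per-residue form gives 2.5 and the R-factor-like form about 0.37. In the per-residue form a fully buried Leu contributes exactly 1.0 regardless of how large its average ASA is, which is the damping effect described above.</p>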
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Average values of partner number (&lt;PN>) and accessible surface area (&lt;ASA>, in &#197;<sup>2</sup>) of different amino acid residues</p>
            </caption>
            <tblbdy cols="3">
               <r>
                  <c ca="center">
                     <p>
                        <b>Residue</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>&lt;PN></b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>&lt;ASA></b>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="3">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Gly</p>
                  </c>
                  <c ca="center">
                     <p>7.4 (2.2)</p>
                  </c>
                  <c ca="center">
                     <p>26.6 (24.5)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Ala</p>
                  </c>
                  <c ca="center">
                     <p>8.6 (2.5)</p>
                  </c>
                  <c ca="center">
                     <p>28.1 (30.9)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Ser</p>
                  </c>
                  <c ca="center">
                     <p>7.9 (2.6)</p>
                  </c>
                  <c ca="center">
                     <p>39.2 (33.2)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Cys</p>
                  </c>
                  <c ca="center">
                     <p>10.0 (2.3)</p>
                  </c>
                  <c ca="center">
                     <p>17.1 (21.0)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Thr</p>
                  </c>
                  <c ca="center">
                     <p>8.5 (2.6)</p>
                  </c>
                  <c ca="center">
                     <p>44.2 (36.0)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Asp</p>
                  </c>
                  <c ca="center">
                     <p>7.9 (2.5)</p>
                  </c>
                  <c ca="center">
                     <p>58.1 (37.2)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Pro</p>
                  </c>
                  <c ca="center">
                     <p>7.7 (2.6)</p>
                  </c>
                  <c ca="center">
                     <p>54.2 (39.5)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Asn</p>
                  </c>
                  <c ca="center">
                     <p>8.3 (2.7)</p>
                  </c>
                  <c ca="center">
                     <p>57.9 (40.8)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Val</p>
                  </c>
                  <c ca="center">
                     <p>10.3 (2.6)</p>
                  </c>
                  <c ca="center">
                     <p>24.1 (32.0)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Glu</p>
                  </c>
                  <c ca="center">
                     <p>8.4 (2.5)</p>
                  </c>
                  <c ca="center">
                     <p>73.4 (41.9)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Gln</p>
                  </c>
                  <c ca="center">
                     <p>9.0 (2.7)</p>
                  </c>
                  <c ca="center">
                     <p>68.6 (43.3)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>His</p>
                  </c>
                  <c ca="center">
                     <p>9.7 (2.9)</p>
                  </c>
                  <c ca="center">
                     <p>53.8 (44.6)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Leu</p>
                  </c>
                  <c ca="center">
                     <p>11.0 (2.7)</p>
                  </c>
                  <c ca="center">
                     <p>28.8 (38.0)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Ile</p>
                  </c>
                  <c ca="center">
                     <p>11.0 (2.7)</p>
                  </c>
                  <c ca="center">
                     <p>25.0 (35.2)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Met</p>
                  </c>
                  <c ca="center">
                     <p>11.2 (3.1)</p>
                  </c>
                  <c ca="center">
                     <p>35.5 (45.8)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Lys</p>
                  </c>
                  <c ca="center">
                     <p>8.4 (2.4)</p>
                  </c>
                  <c ca="center">
                     <p>95.8 (42.9)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Phe</p>
                  </c>
                  <c ca="center">
                     <p>11.9 (2.9)</p>
                  </c>
                  <c ca="center">
                     <p>31.0 (39.8)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Tyr</p>
                  </c>
                  <c ca="center">
                     <p>11.5 (3.1)</p>
                  </c>
                  <c ca="center">
                     <p>45.5 (45.0)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Arg</p>
                  </c>
                  <c ca="center">
                     <p>10.1 (3.1)</p>
                  </c>
                  <c ca="center">
                     <p>85.5 (53.3)</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>Trp</p>
                  </c>
                  <c ca="center">
                     <p>12.6 (3.2)</p>
                  </c>
                  <c ca="center">
                     <p>43.5 (47.6)</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>Data taken from <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. The standard deviations are in parentheses.</p>
            </tblfn>
         </tbl>
         <sec>
            <st>
               <p>Quantification of the overall packing of residues in protein structures</p>
            </st>
            <p>The average number of partner residues and the average accessible surface area for all twenty amino acids are provided in Table <tblr tid="T1">1</tblr> <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. While the &lt;ASA> values are almost identical to those calculated earlier <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, the values for the partner number differ because the calculation here is residue-based, whereas in the earlier study individual atoms constituted the partners.</p>
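            <p>A residue-based partner count can be sketched as follows. This is only an illustration under stated assumptions: the 4.5 &#197; cutoff comes from the abstract, but the contact criterion used here (any atom of one residue within the cutoff of any atom of the other) and the inclusion of sequence neighbours are assumptions, and the function names and toy coordinates are hypothetical.</p>

```python
import math

CUTOFF = 4.5  # distance cutoff in angstroms, as stated in the abstract


def are_partners(atoms_a, atoms_b, cutoff=CUTOFF):
    """True if any atom of one residue lies within the cutoff of any atom of the other.

    This any-atom criterion is an assumption, not taken from the paper.
    """
    return any(cutoff >= math.dist(a, b) for a in atoms_a for b in atoms_b)


def partner_numbers(residues, cutoff=CUTOFF):
    """Count, for each residue, the other residues in contact with it."""
    counts = [0] * len(residues)
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            if are_partners(residues[i], residues[j], cutoff):
                counts[i] += 1
                counts[j] += 1
    return counts


# Hypothetical toy input: each residue is a list of (x, y, z) atom coordinates.
toy = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(4.0, 0.0, 0.0)],
    [(10.0, 0.0, 0.0)],
]
pn = partner_numbers(toy)
```

            <p>Each residue is counted at most once as a partner however many of its atoms are in contact, which is what distinguishes a residue-based count from an atom-based one.</p>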
            <p>As R<sub>p</sub> and R<sub>s</sub> indicate the extent to which the PN and ASA of residues deviate from their average values, taken over the whole structure, these parameters can be used to judge how well the packing of residues in a structure is optimized <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. We also wanted to see whether they vary with the class of protein. Because R<sub>p</sub> and R<sub>s</sub> are cumulative over all the residues in a structure, it is sensible to divide them by the number of residues in the structure before comparison. Individual protein structures in the dataset were classified according to CATH (Class, Architecture, Topology, Homologous superfamily; <url>http://www.cathdb.info/index.html</url>) into 157 all-&#945;, 142 all-&#946; and 133 &#945;&#946; (including &#945;+&#946; and &#945;/&#946;) classes of proteins. The normalized values (Table <tblr tid="T2">2</tblr>) are rather similar, except for slightly higher values in the all-&#946; class, indicating somewhat higher deviations from the optimum values of PN and ASA in these structures. The higher values in &#946;-proteins are consistent with a relatively lower packing efficiency in these proteins, as is also demonstrated by the higher occurrence of cavities involving residues in &#946;-sheets <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Average values of R<sub>s </sub>and R<sub>p </sub>in various protein structural classes<sup>a</sup></p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Number of structures</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>R<sub>s</sub></b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>R<sub>p</sub></b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All-&#945;</p>
                     </c>
                     <c ca="center">
                        <p>157</p>
                     </c>
                     <c ca="left">
                        <p>112 (63)</p>
                     </c>
                     <c ca="left">
                        <p>30 (20)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.92 (0.86)</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.27 (0.27)</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All-&#946;</p>
                     </c>
                     <c ca="center">
                        <p>142</p>
                     </c>
                     <c ca="left">
                        <p>115 (60)</p>
                     </c>
                     <c ca="left">
                        <p>31 (19)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.26 (1.01)</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.37 (0.33)</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>&#945;&#946; <sup>b</sup></p>
                     </c>
                     <c ca="center">
                        <p>133</p>
                     </c>
                     <c ca="left">
                        <p>149 (70)</p>
                     </c>
                     <c ca="left">
                        <p>42 (23)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.72 (0.55)</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.23 (0.18)</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Overall</p>
                     </c>
                     <c ca="center">
                        <p>432</p>
                     </c>
                     <c ca="left">
                        <p>143 (91)</p>
                     </c>
                     <c ca="left">
                        <p>39 (23)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.83 (0.76)</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.22 (0.17)</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>According to CATH <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. <sup>b</sup>Including &#945;/&#946; and &#945;+&#946;.</p>
                  <p>Standard deviations are in parentheses. Normalized values (obtained by dividing the values from equations (2) and (3) by the number of residues) are given in italics.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Identification of the native structure from misfolded decoys</p>
            </st>
            <sec>
               <st>
                  <p>PROSTAR decoy sets</p>
               </st>
               <p>The objective of this work is to discriminate between the native structure and one or more misfolded or low-resolution structures. The utility of R<sub>p</sub> and R<sub>s</sub> was tested on the decoy sets available on the PROSTAR website, and the results are shown in Table <tblr tid="T3">3</tblr>. Compared with other atomic or residue-based potentials, the present parameters, R<sub>s</sub> and R<sub>p</sub>, have similar or better performance, except for 'Ifu'. Of the two parameters, R<sub>s</sub>, which is based on residue accessibility, performs better than R<sub>p</sub>, which is derived from the partner number.</p>
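               <p>The two criteria used to judge a scoring function on a decoy set (whether the native structure has the minimum value, and how far it lies from the decoy average) can be sketched in Python as below. The function names and the scores are hypothetical, and expressing the separation in units of the decoy standard deviation is an assumption made for illustration; the text only speaks of separation from the average value.</p>

```python
import statistics


def native_has_minimum(native_score, decoy_scores):
    """True if the native score is strictly lower than every decoy score."""
    return all(d > native_score for d in decoy_scores)


def separation(native_score, decoy_scores):
    """Decoy mean minus native score, in units of the decoy standard deviation.

    The normalization by the sample standard deviation is an assumption.
    """
    mean = statistics.mean(decoy_scores)
    spread = statistics.stdev(decoy_scores)
    return (mean - native_score) / spread


# Hypothetical scores for one decoy set (made-up numbers, not from the paper).
native = 80.0
decoys = [100.0, 110.0, 120.0, 130.0, 140.0]

is_best = native_has_minimum(native, decoys)
gap = separation(native, decoys)
```

               <p>Counting the decoy sets for which the first function returns true yields success fractions of the kind reported in Table 3; a larger value of the second function indicates a native structure that is better separated from the decoys.</p>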
               <tbl id="T3">
                  <title>
                     <p>Table 3</p>
                  </title>
                  <caption>
                     <p>Identification of the native structure from decoys in PROSTAR decoy sets using different scoring functions<sup>a</sup></p>
                  </caption>
                  <tblbdy cols="5">
                     <r>
                        <c ca="left">
                           <p>
                              <b>Parameters</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Misfold</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Ifu</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Asilomar</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Pdberr and sgpa</b>
                           </p>
                        </c>
                     </r>
                     <r>
                        <c cspan="5">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>R<sub>s </sub><sup>b</sup></p>
                        </c>
                        <c ca="center">
                           <p>24/24</p>
                        </c>
                        <c ca="center">
                           <p>22/43</p>
                        </c>
                        <c ca="center">
                           <p>41/41</p>
                        </c>
                        <c ca="center">
                           <p>5/5</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>R<sub>p </sub><sup>b</sup></p>
                        </c>
                        <c ca="center">
                           <p>20/24</p>
                        </c>
                        <c ca="center">
                           <p>21/43</p>
                        </c>
                        <c ca="center">
                           <p>36/41</p>
                        </c>
                        <c ca="center">
                           <p>5/5</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Atomic KBP<sup>c</sup></p>
                        </c>
                        <c ca="center">
                           <p>24/24</p>
                        </c>
                        <c ca="center">
                           <p>32/43</p>
                        </c>
                        <c ca="center">
                           <p>37/41</p>
                        </c>
                        <c ca="center">
                           <p>5/5</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>RAPDF<sup>d</sup></p>
                        </c>
                        <c ca="center">
                           <p>24/24</p>
                        </c>
                        <c ca="center">
                           <p>30/43</p>
                        </c>
                        <c ca="center">
                           <p>37/41</p>
                        </c>
                        <c ca="center">
                           <p>5/5</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>CDF<sup>d</sup></p>
                        </c>
                        <c ca="center">
                           <p>19/24</p>
                        </c>
                        <c ca="center">
                           <p>21/43</p>
                        </c>
                        <c ca="center">
                           <p>35/41</p>
                        </c>
                        <c ca="center">
                           <p>5/5</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Residue contact potential<sup>e</sup></p>
                        </c>
                        <c ca="center">
                           <p>24/24</p>
                        </c>
                        <c ca="center">
                           <p>22/43</p>
                        </c>
                        <c ca="center">
                           <p>35/41</p>
                        </c>
                        <c ca="center">
                           <p>4/5</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>PROSTAR website <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
                     <p><sup>a</sup>The first number in each column is the number of cases in which the native structure is correctly identified, and the second number, after the slash, is the total number of cases. With either of the first two parameters, the native structure is correctly identified if its value is smaller than that of any other structure in the decoy set. The results with the other parameters are taken from <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
                     <p><sup>b</sup>The parameters developed in this study.</p>
                     <p><sup>c</sup>The atomic Knowledge-Based Potential from Lu and Skolnick <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
                     <p><sup>d</sup>RAPDF and CDF are atomic and residue-based potentials, respectively, from Samudrala and Moult <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
                     <p><sup>e</sup>Residue-based quasichemical potential from Skolnick et al. <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
                  </tblfn>
               </tbl>
               <p>The 'Misfold' decoy set, generated by Holm and Sander <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, consists of 24 examples of pairs of proteins with the same number of residues in the chain, but different sequences and conformations. Sequences are swapped between members of a pair, resulting in rather inappropriate environments for most of the side chains. For this set, R<sub>s </sub>selects 100% of the structures correctly, but R<sub>p </sub>fails in four. Attempts were made to see if the use of other cut-off distances (4.0, 5.0, 6.0 and 7.0 &#197;) in the definition of R<sub>p </sub>improved the situation, but the performance of the parameter derived at 4.5 &#197; was found to be the best.</p>
               <p>The 'Ifu' decoy set is based on a set of 43 peptides, 10-20 residues long, which are proposed to be independent folding units as determined by local hydrophobic burial and experimental evidence <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. In this test set, R<sub>s </sub>and R<sub>p </sub>failed to pick 21 and 22, respectively, of the 43 native structures. Even the best performer, the knowledge-based potential <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, failed in 11 cases in this test set. This is probably because the targets in this set are protein fragments, and it is difficult for residue packing parameters derived from larger proteins to evaluate such structures.</p>
               <p>The 'Asilomar' decoy set resulted from the first experiment on the Critical Assessment of Protein Structure Prediction methods (CASP), which produced a set of 41 comparative models of six different proteins <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The models vary in C<sup>&#945; </sup>rmsd to the corresponding experimental conformation, ranging from 0.53 to 7.40 &#197;, depending on the difficulty of the model building process. In this test set, the parameter R<sub>s </sub>selects all the native structures correctly, the best result among the discriminatory functions compared. R<sub>p </sub>misses 5 of the 41 cases, a performance on par with that of the other functions.</p>
               <p>The 'Pdberr' decoy set consists of structures determined using X-ray crystallography that were later found to contain errors, and the corresponding corrected experimental conformations <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. The 'sgpa' decoy set consists of the experimental structure of <it>Streptomyces griseus </it>Protease A (<ext-link ext-link-id="2sga" ext-link-type="pdb">2sga</ext-link>) and two conformations generated by molecular dynamics simulations starting with the experimental structure <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. In these test sets, where the decoys are low-resolution X-ray structures, both scoring functions, R<sub>s </sub>and R<sub>p</sub>, correctly picked the high-resolution structures in all cases, as did all the other potential functions except the one based on the residue contact potential with a composition-corrected scale <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
            </sec>
            <sec>
               <st>
                  <p>Park and Levitt decoy set</p>
               </st>
               <p>The Park and Levitt decoy test set, available on the web site <url>http://dd.compbio.washington.edu</url>, consists of 7 sequences, each with roughly 600-700 decoys that cover structures showing an rmsd ranging from 0 (the correct fold) to 10 &#197; from the native structure <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The protein structures were generated by using four-state models (four discrete &#981;,&#968; angles) to define the conformation of each of ten selected residues in each protein using an off-lattice model. From the very large number of conformations generated, only those compact structures were retained that scored well with a variety of scoring functions and had a reasonable rmsd from the native structure. The 4-state-reduced decoy data set given in Additional file <supplr sid="S1">1</supplr>: Table S1 includes a range of small proteins of 54-75 residues with varying topological folds, with the numbers of decoys ranging from 630 for <ext-link ext-link-id="1ctf" ext-link-type="pdb">1ctf</ext-link> to 687 for <ext-link ext-link-id="4pti" ext-link-type="pdb">4pti</ext-link>. A positive Z-score (Equations (4) and (5)) indicates that the value of the parameter for a particular native fold is lower than the average of the distribution. On the basis of Z<sub>s</sub>, the native structure is well separated from the average of the distribution for all the structures, but Z<sub>p </sub>shows an inferior result for <ext-link ext-link-id="1r69" ext-link-type="pdb">1r69</ext-link> and <ext-link ext-link-id="1sn3" ext-link-type="pdb">1sn3</ext-link>. Figure <figr fid="F1">1</figr> plots R<sub>s </sub>vs rmsd for a representative dataset corresponding to the PDB file <ext-link ext-link-id="1ctf" ext-link-type="pdb">1ctf</ext-link>. The value of R<sub>s </sub>is the minimum for the native structure. 
There is a good linear correlation between the two variables (R<sup>2 </sup>is 0.78), better than that (0.6) obtained using the knowledge-based potential of Lu and Skolnick <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. While the various energy functions based on empirical contact, surface area and van der Waals energy did not perform consistently well in distinguishing between correct and incorrect conformations and had to be used in combination for the proper identification of the correct fold <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, the rather simple parameter R<sub>s </sub>has remarkable discriminatory power.</p>
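               <p>As an illustration of the two statistics used in this comparison, the following minimal Python sketch computes a Z-score and the squared linear correlation coefficient. It assumes that the Z-score is the difference between the mean of the decoy distribution and the native value, divided by the population standard deviation of that distribution, so that a positive value means the native value lies below the average. The exact form of Equations (4) and (5) may differ, for example in whether the native structure is included in the distribution. The function names are hypothetical.</p>

```python
import math


def z_score(native_value, decoy_values):
    """Assumed convention: Z = (mean - native) / std over the decoy values.

    A positive Z means the native value is lower than the decoy average.
    """
    n = len(decoy_values)
    mean = sum(decoy_values) / n
    # Population variance; a sample variance (n - 1) is an equally plausible choice.
    variance = sum((v - mean) ** 2 for v in decoy_values) / n
    return (mean - native_value) / math.sqrt(variance)


def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between two variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)
```

               <p>Here <it>xs</it> would hold the rmsd of each decoy and <it>ys</it> the corresponding value of the scoring parameter; both lists are placeholders for data that are not reproduced in this sketch.</p>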
               <suppl id="S1">
                  <title>
                     <p>Additional file 1</p>
                  </title>
                  <text>
                     <p><b>Identification of native structure from decoys in different decoy sets</b>. The file contains three tables, numbered S1 to S3.</p>
                  </text>
                  <file name="1472-6807-9-76-S1.DOC">
                     <p>Click here for file</p>
                  </file>
               </suppl>
               <fig id="F1">
                  <title>
                     <p>Figure 1</p>
                  </title>
                  <caption>
                     <p>Scatter plot of R<sub>s </sub>vs. rmsd for a representative protein structure, <ext-link ext-link-type="pdb" ext-link-id="1ctf">1ctf</ext-link>, along with its decoys</p>
                  </caption>
                  <text>
                     <p><b>Scatter plot of R<sub>s </sub>vs. rmsd for a representative protein structure, </b><ext-link ext-link-type="pdb" ext-link-id="1ctf">1ctf</ext-link>, <b>along with its decoys</b>.</p>
                  </text>
                  <graphic file="1472-6807-9-76-1"/>
               </fig>
               <p>The Levitt local-minima decoy sets (LMDS) also contain structural decoys (the number ranging from 343 to 500) for 7 small proteins, 36 to 68 residues long <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. An initial ten thousand structures, generated by randomly modifying only the loop dihedral angles, were subjected to minimization using a modified ENCAD force field involving united and soft atoms <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, and up to five hundred of the lowest-energy conformations were retained to make up the decoy sets. For all the 7 cases the native structure has the minimum R<sub>s </sub>value and the corresponding Z-score indicates that it is well separated from the decoys (Additional file <supplr sid="S1">1</supplr>: Table S1). However, Z<sub>p </sub>gives an inferior result for <ext-link ext-link-id="1bba" ext-link-type="pdb">1bba</ext-link> and <ext-link ext-link-id="1fc2" ext-link-type="pdb">1fc2</ext-link>. Other energy functions also failed to identify the native structure for these two proteins <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B22">22</abbr></abbrgrp>. For the former, the native conformation is simply not very well defined <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>; the latter is a fragment of a larger protein and a constituent of a complex, and in the unbound form it may have a structure different from that in the complex <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Interestingly, however, based on R<sub>s </sub>both native structures are separated by about two standard deviations from the average of the distribution.</p>
            </sec>
            <sec>
               <st>
                  <p>ROSETTA decoy sets</p>
               </st>
               <p>The ROSETTA all-atom decoy sets are composed of five different proteins ranging in size from 92 to 116 residues, with the number of decoys ranging from 994 to 999 (Additional file <supplr sid="S1">1</supplr>: Table S1) <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Fragments of 3 to 9 residues from known structures, matched to the targets through a multiple sequence alignment process, were assembled into the protein structures via the fragment insertion-simulated annealing strategy <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. The scoring functions used to select the lowest energy decoys included hydrophobic burial, electrostatics, the formation of &#946;-sheets and the packing of &#945;-helices and &#946;-strands. The Z-scores based on R<sub>s </sub>and R<sub>p </sub>indicate that both scoring functions perform well for all 5 structures. The large Z-scores seen here, compared with those for the other decoy sets, are probably due to the high rmsds of the decoys in this test set.</p>
               <p>The original ROSETTA decoy set has been improved by increasing the number of proteins and frequency of near native models, providing 1,400 model structures for 78 diverse, single domain proteins with varying degrees of secondary structure and length from 25 to 87 residues for the evaluation of scoring functions <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The discriminatory ability of our scoring functions can be seen from the results on 41 cases (a subset of the complete dataset, which is downloadable) presented in Additional file <supplr sid="S1">1</supplr>: Table S2. The native structure did not have the minimum R<sub>s </sub>value in 3 cases, while R<sub>p </sub>failed in two additional cases. For these, the Z-score is also quite small, Z<sub>p </sub>even registering a negative value in two. It may be noted that two structures (<ext-link ext-link-id="1res" ext-link-type="pdb">1res</ext-link> and <ext-link ext-link-id="1uxd" ext-link-type="pdb">1uxd</ext-link>) among the failed cases were derived from NMR experiments and the Rosetta energy functions are also less efficient in identifying the NMR structures as compared to X-ray crystal structures, probably because the former structures have greater deviation of side chain conformations from the canonical rotamer conformations <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
            </sec>
            <sec>
               <st>
                  <p>Identification of the native structure from the native-like conformation constructed by homology modeling</p>
               </st>
               <p>Samudrala and Levitt <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> have a decoy set (hg_structal) for 29 globins. Each globin has been built by comparative modeling using 29 other globins as templates with the program segmod <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>; the rmsd for the modeled structures ranges from 1.96 to 8.57 &#197;. A similar decoy set (ig_structal_hires) involving 20 immunoglobulins at a relatively higher resolution (1.7-2.2 &#197;, compared to the range of 1.7-3.1 &#197; for the full set of 61 proteins) is also available. The application of our scoring functions to these two sets yields the results given in Table <tblr tid="T4">4</tblr>. As with the other decoy sets, R<sub>s </sub>performs better than R<sub>p </sub>in identifying the native structure. Even though the homology-built models in the 'ig_structal_hires' set are very close to the native structure, the latter was identifiable in 90% of the cases.</p>
               <tbl id="T4">
                  <title>
                     <p>Table 4</p>
                  </title>
                  <caption>
                     <p>Identification of native structure from decoys constructed by homology modeling</p>
                  </caption>
                  <tblbdy cols="3">
                     <r>
                        <c ca="left">
                           <p>
                              <b>Parameters</b>
                           </p>
                        </c>
                        <c ca="left">
                           <p>
                              <b>hg_structal</b>
                           </p>
                        </c>
                        <c ca="left">
                           <p>
                              <b>ig_structal_hires</b>
                           </p>
                        </c>
                     </r>
                     <r>
                        <c cspan="3">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>R<sub>s</sub></p>
                        </c>
                        <c ca="left">
                           <p>23/29</p>
                        </c>
                        <c ca="left">
                           <p>18/20</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>R<sub>p</sub></p>
                        </c>
                        <c ca="left">
                           <p>15/29</p>
                        </c>
                        <c ca="left">
                           <p>17/20</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>Dataset taken from <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> (<url>http://dd.compbio.washington.edu</url>). The first number in each column is the number of cases in which the native structure is correctly identified, and the second number is the total number of cases.</p>
                  </tblfn>
               </tbl>
            </sec>
            <sec>
               <st>
                  <p>Score of the experimental structure relative to the solutions submitted to CASP7</p>
               </st>
               <p>The ability of our scoring functions to identify the native structure from the best near-native solutions has been tested on the CASP7 dataset <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. This is the most difficult test, as the decoys are the best predicted near-native structures submitted by different groups participating in the CASP experiment. The CASP7 experiment consists of 95 accepted targets for which about 22000 models were submitted. We have excluded the NMR structures and retained 71 targets (with ~ 19000 models) to evaluate our scoring functions (Additional file <supplr sid="S1">1</supplr>: Table S3). The rmsd between the native structure and the best predicted solution varies in the range 0.4 - 2.6 &#197; in the whole dataset. Z<sub>s </sub>identifies the best solution in 51 cases and Z<sub>p </sub>in 38. Table <tblr tid="T5">5</tblr> compares the results of our study vis-&#224;-vis those from other algorithms <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. As we have seen before, R<sub>s </sub>performs better than R<sub>p</sub>. But even R<sub>p </sub>outperforms the other existing functions in locating the native structure among the top ten solutions. R<sub>s </sub>identifies the native structure as the top solution in 72% of the cases, which is considerably better than the next best performers (DFIRE and QMEAN3) at 62%.</p>
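               <p>The Rank1 and Rank10 percentages reported for CASP7 can be illustrated with a short Python sketch. It assumes that a lower score is better and that the rank of the native structure is one plus the number of submitted models with a strictly lower score; how ties were treated in the actual analysis is not stated, and the function names are hypothetical.</p>

```python
def native_rank(native_score, model_scores):
    """1-based rank of the native structure when lower scores are better."""
    better_models = sum(1 for score in model_scores if native_score > score)
    return 1 + better_models


def percent_within(ranks, cutoff):
    """Percentage of targets whose native structure ranks at or above the cutoff."""
    hits = sum(1 for rank in ranks if cutoff >= rank)
    return 100.0 * hits / len(ranks)
```

               <p>With 51 rank-1 cases out of 71 targets, <it>percent_within</it> gives 71.83%, and with 38 out of 71 it gives 53.52%, matching the Rank1 entries for R<sub>s </sub>and R<sub>p </sub>in Table <tblr tid="T5">5</tblr>.</p>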
               <tbl id="T5">
                  <title>
                     <p>Table 5</p>
                  </title>
                  <caption>
                     <p>Performance of different scoring functions in identifying the native structure among the best near-native structures submitted in CASP7</p>
                  </caption>
                  <tblbdy cols="4">
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c cspan="2" ca="center">
                           <p>
                              <b>% of the native structure<sup>b</sup></b>
                           </p>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c>
                           <p/>
                        </c>
                        <c cspan="2">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>
                              <b>Method<sup>a</sup></b>
                           </p>
                        </c>
                        <c ca="left">
                           <p>
                              <b>Z<sub>nat</sub></b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Rank1</b>
                           </p>
                        </c>
                        <c ca="center">
                           <p>
                              <b>Rank10</b>
                           </p>
                        </c>
                     </r>
                     <r>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Modcheck</p>
                        </c>
                        <c ca="left">
                           <p>1.99</p>
                        </c>
                        <c ca="center">
                           <p>49.47</p>
                        </c>
                        <c ca="center">
                           <p>72.63</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>RAPDF</p>
                        </c>
                        <c ca="left">
                           <p>-2.09</p>
                        </c>
                        <c ca="center">
                           <p>57.89</p>
                        </c>
                        <c ca="center">
                           <p>81.05</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>DFIRE</p>
                        </c>
                        <c ca="left">
                           <p>-1.25</p>
                        </c>
                        <c ca="center">
                           <p>62.11</p>
                        </c>
                        <c ca="center">
                           <p>75.79</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>ProQ</p>
                        </c>
                        <c ca="left">
                           <p>1.51</p>
                        </c>
                        <c ca="center">
                           <p>9.47</p>
                        </c>
                        <c ca="center">
                           <p>33.68</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>ProQ_SSE</p>
                        </c>
                        <c ca="left">
                           <p>1.76</p>
                        </c>
                        <c ca="center">
                           <p>14.74</p>
                        </c>
                        <c ca="center">
                           <p>44.21</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>FRST</p>
                        </c>
                        <c ca="left">
                           <p>-2.41</p>
                        </c>
                        <c ca="center">
                           <p>58.95</p>
                        </c>
                        <c ca="center">
                           <p>75.79</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>QMEAN3</p>
                        </c>
                        <c ca="left">
                           <p>-2.27</p>
                        </c>
                        <c ca="center">
                           <p>62.11</p>
                        </c>
                        <c ca="center">
                           <p>78.95</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>R<sub>p</sub></p>
                        </c>
                        <c ca="left">
                           <p>1.69</p>
                        </c>
                        <c ca="center">
                           <p>53.52</p>
                        </c>
                        <c ca="center">
                           <p>91.55</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>R<sub>s</sub></p>
                        </c>
                        <c ca="left">
                           <p>2.17</p>
                        </c>
                        <c ca="center">
                           <p>71.83</p>
                        </c>
                        <c ca="center">
                           <p>98.59</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>Z<sub>nat </sub>corresponds to the average Z-score of the native structure.</p>
                     <p><sup>a </sup>Except for the last two functions, the performance values are based on the data provided in Table 6 of <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
                     <p><sup>b </sup>% of the native structure with rank 1 or within rank 10 from among all the solutions submitted in CASP7.</p>
                  </tblfn>
               </tbl>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>There are many energy functions (knowledge-based statistical scoring functions, physics-based functions, or a combination of both) that find the correct native conformation among misfolded decoys <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B6">6</abbr><abbr bid="B9">9</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B22">22</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>. However, it is nontrivial to develop a function that works across different decoy sets, and a combination of functions is normally used <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. The R-factor is the gold standard for expressing the accuracy of a crystallographic analysis, and as knowledge-based functions are mostly "trained" on crystal structures, it is gratifying to develop functions similar to the R-factor that can also be used to characterize the native structure (Table <tblr tid="T2">2</tblr>).</p>
         <p>The present study demonstrates the development of scoring functions from the properties of residue packing that can be useful for discriminating the native conformation from various misfolded conformations for a given protein sequence. The algorithm assumes that a protein tries to take up a fold that has the minimum deviation of ASA (or PN) of each residue from the average value observed over all protein structures. The function R<sub>s</sub>, based on residue accessibility, performs better than the one derived from the partner number, R<sub>p</sub>, on decoy sets. The test on various decoy sets from the PROSTAR website demonstrated that the knowledge-based scoring functions developed in this study perform as well as or better than those previously derived by many authors <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. Not only do the present knowledge-based scoring functions pick the correct native structure in most cases, but the discrimination ratio is also better than that of the other potentials. However, as Equations (2) and (3) use the average values derived from a database of globular proteins, they are not likely to be very discriminatory for small proteins or peptides (as seen for the 'Ifu' set in Table <tblr tid="T3">3</tblr>). As such, they would not be useful for checking local model quality in protein structures, as done by packages such as PROSA <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. Along the same lines, it may be mentioned that the Verify3D server <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> for the visual analysis of the quality of a crystal structure works best on proteins with at least 100 residues.</p>
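         <p>The idea of scoring a conformation by its deviation from database averages can be sketched as follows. This is not a reproduction of Equations (2) and (3): the R-factor-like normalization (sum of absolute deviations divided by the sum of the reference means) is an assumption made for illustration, and the function name, argument names and any reference values supplied to it are hypothetical.</p>

```python
def deviation_score(residue_types, observed_values, reference_means):
    """Assumed R-factor-like score: sum of |observed - mean| over sum of means.

    residue_types   : residue names along the sequence, e.g. "ALA"
    observed_values : per-residue ASA (or partner number) in the conformation
    reference_means : hypothetical mapping from residue type to database mean
    """
    numerator = 0.0
    denominator = 0.0
    for name, value in zip(residue_types, observed_values):
        mean = reference_means[name]
        # Excess and deficit relative to the mean are penalized equally.
        numerator += abs(value - mean)
        denominator += mean
    return numerator / denominator
```

         <p>Under this assumed form, a conformation whose residues all sit at their database averages scores zero, and the conformation with the lowest score in a decoy set would be taken as the native structure.</p>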
         <p>The Park and Levitt decoy set had been shown to be quite a challenging dataset where the lowest-energy structures typically were 6-10 &#197; rmsd away from the native ones <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The improved residue-based potential <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> also cannot recognize the native and near-native structures in all cases. The knowledge-based scoring functions derived in this study are quite efficient in identifying the near-native fold in the Park and Levitt decoy sets. The correlation between the scoring function and rmsd is good in all cases, and in most cases the scoring functions have the minimum value for the native structure. The scoring functions also perform well in the PROSTAR decoy sets, Levitt's Local-Minima Decoy Sets (LMDS) and the ROSETTA All-atom Decoy Sets. Over the 222 independent cases considered in this analysis, R<sub>s </sub>and R<sub>p </sub>discriminate the native structure from all the corresponding decoys with success rates greater than 85% and 74%, respectively. If we do not consider the 'Ifu' dataset, which comprises small fragments of polypeptide chains, the success rates increase to 94% and 80%, respectively. The most rigorous test of a scoring function is to evaluate its performance in identifying the native structure with reference to the models submitted in the CASP7 experiment. Even here, both R<sub>s </sub>and R<sub>p</sub>, the former in particular, stand out from all other methods (Table <tblr tid="T5">5</tblr>).</p>
         <p>As our scoring functions depend on ASA or PN, these should be closely related to potentials of mean force derived from solvation or packing considerations. The performance of these potentials, however, depends critically on how the standard state is specified <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B12">12</abbr><abbr bid="B23">23</abbr></abbrgrp>. As the core and surface regions in proteins constitute distinct environments, potentials are sometimes divided into two parts, for the buried and the solvent-accessible regions <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. The use of the average values of ASA or PN in globular proteins seems to have eliminated the need for such a division, and the debate on the proper choice of the standard state.</p>
         <p>A discussion on the uniqueness of our parameters vis-&#224;-vis other knowledge-based discrimination functions is in order. First, a residue in the sequence is normally represented in these functions with one or two positions in the three-dimensional space and one or more of its properties, such as the secondary structure or backbone dihedral angle preferences, features in distance or sequence separation from other residues, etc. are considered <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B23">23</abbr></abbrgrp>. With such a coarse representation the function may not be as efficient as an all-atom discriminatory function, which takes into account the environment of all the atoms in a residue <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp>. An all-atom representation is implicit in our method, as all the atoms are needed for the calculation of ASA or the partner number. However, each residue in the sequence contributes singly to the derivation of R<sub>s </sub>or R<sub>p</sub>. This is also in contrast to residue-residue interaction energy for each residue pair that is normally employed in other functions <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr></abbrgrp>. Furthermore, residue triplets and four-body contact potentials have also been developed <abbrgrp><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp>. Secondly, the energy functions are generally less discriminatory when used individually, and the use of the hybrid scoring function is the norm for an enhanced performance <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B16">16</abbr><abbr bid="B22">22</abbr></abbrgrp>. While conceptually simple, R<sub>s </sub>or R<sub>p </sub>can work as efficiently. 
Thirdly, most formulations use energy as the criterion (with the assumption that the native structure is at a global free-energy minimum), while our function seeks the conformation that has the minimum deviation from the average value of the partner number or ASA. In this way the most compact state of the polypeptide chain corresponding to a given sequence is selected. The parameters are less likely to be fooled by an over-abundance of contacts (which is penalized to the same extent as an under-abundance in equations 2 and 3), as can happen with some functions <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Lastly, as the functions can identify the correct structure from the erroneous ones modeled from X-ray data ('Pdberr' set in Table <tblr tid="T3">3</tblr>) and vary within a narrow range in different protein classes (Table <tblr tid="T2">2</tblr>), they can be used to validate crystallographically determined structures <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>.</p>
         <p>The functions developed here can also be used to assess the compatibility of a sequence with a fold. For example, azurin <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> and plastocyanin <abbrgrp><abbr bid="B54">54</abbr></abbrgrp> are two small proteins having the same fold (a sandwich of two &#946;-sheets having seven strands), but a sequence identity of only 17% over an aligned length of 86 residues (Table <tblr tid="T6">6</tblr>). As expected, they have very similar R<sub>s </sub>and R<sub>p </sub>values. More interestingly, however, when the sequence of plastocyanin is considered over the structure of azurin one gets a value of 0.97 for (R<sub>s</sub>)<sub>azu/pcy</sub>, quite close to the 0.89 obtained for the reverse process ((R<sub>s</sub>)<sub>pcy/azu</sub>), thereby indicating the compatibility of the two sequences with the same fold.</p>
         <tbl id="T6">
            <title>
               <p>Table 6</p>
            </title>
            <caption>
               <p>R<sub>s </sub>and R<sub>p </sub>for two proteins having the same fold belonging to the &#946; class</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="left">
                     <p>
                        <b>Name of the protein</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>Number of residues</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>Number of aligned residues</b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>R<sub>s</sub></b>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <b>R<sub>p</sub></b>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Azurin (<ext-link ext-link-type="pdb" ext-link-id="1azu">1azu</ext-link>)</p>
                  </c>
                  <c ca="center">
                     <p>126</p>
                  </c>
                  <c ca="center">
                     <p>84</p>
                  </c>
                  <c ca="center">
                     <p>1.12</p>
                     <p>
                        <it>(1.06)</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.33</p>
                     <p>
                        <it>(0.29)</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Plastocyanin (<ext-link ext-link-type="pdb" ext-link-id="5pcy">5pcy</ext-link>)</p>
                  </c>
                  <c ca="center">
                     <p>99</p>
                  </c>
                  <c ca="center">
                     <p>84</p>
                  </c>
                  <c ca="center">
                     <p>1.33</p>
                     <p>
                        <it>(1.14)</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>0.46</p>
                     <p>
                        <it>(0.32)</it>
                     </p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>The structures are aligned using the software SSM at EBI <url>http://www.ebi.ac.uk/msd-srv/ssm</url>. The values calculated considering only the aligned amino acid residues are given in parentheses. To quantify the sequence-structure compatibility between the two proteins, two more parameters are computed over the aligned residues: (R<sub>s</sub>)<sub>azu/pcy </sub>= 0.97 and (R<sub>s</sub>)<sub>pcy/azu </sub>= 0.89. Each term contributing to the former corresponds to (ASA<sub>azu </sub>- &lt;ASA><sub>pcy</sub>)/&lt;ASA><sub>pcy</sub>, i.e., in Eq. (3) the observed value at a given position in the structure of azurin is compared to the average value corresponding to the aligned residue type at the same position in the sequence of plastocyanin. The opposite is done in the calculation of (R<sub>s</sub>)<sub>pcy/azu</sub>.</p>
            </tblfn>
         </tbl>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>This work demonstrates the effectiveness of a simple knowledge-based scoring function derived from residue packing for discriminating native structures from large sets of decoys constructed by several groups. The scoring scheme is simple to derive and less computationally intensive than other energy functions, and its performance is better than (or at least on par with) that of the others. When the function is used in conjunction with other chemically intuitive parameters that capture the essence of the protein structure, it should be possible to achieve complete discrimination between the native structure and decoys.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Atomic coordinates were obtained from the Protein Data Bank (PDB) <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. The analysis was carried out using the dataset of 432 polypeptide chains in 418 PDB files (given in <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>) with an <it>R</it>-factor &#8804; 20%, a resolution &#8804; 2.0 &#197; and sequence identity &lt; 25%. In addition, polypeptide chains with >40% of atoms having a temperature factor (<it>B</it>-factor) >30 &#197;<sup>2 </sup>were excluded. The calculation of the partner number was restricted to the well-ordered residues by excluding those with >40% of atoms having a temperature factor >30 &#197;<sup>2</sup>. The solvent accessible surface area (ASA) was computed using the program NACCESS <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>, which is an implementation of the Lee and Richards algorithm <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. The partner number of a residue is the number of other residues within a distance of 4.5 &#197; from any atom of the residue under consideration; the flanking residues were not considered as partners if the interaction was only with the main-chain atoms. The reason for the selection of the particular threshold value for the distance has been discussed previously <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B58">58</abbr></abbrgrp>. A residue is identified as a partner if at least one pair of atoms is in contact.</p>
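         <p>The partner-number definition above can be illustrated with a minimal Python sketch. It is not the program used in this work. The input layout (a list of residues in sequence order, each a list of atom-name and coordinate pairs), the set of main-chain atom names and the function names are assumptions made for illustration. The flanking-residue rule is interpreted here as ignoring contacts in which both atoms belong to the main chain, which is one possible reading of the text.</p>

```python
import math

CUTOFF = 4.5  # distance threshold in angstroms, as stated in the text
MAIN_CHAIN = {"N", "CA", "C", "O"}  # assumed main-chain atom names (PDB convention)


def in_contact(atoms_a, atoms_b, ignore_main_chain_pairs=False):
    """Return True if at least one atom pair lies within CUTOFF.

    atoms_a and atoms_b are lists of (atom_name, (x, y, z)) tuples.
    When ignore_main_chain_pairs is True, pairs in which both atoms
    are main-chain atoms are skipped (assumed flanking-residue rule).
    """
    for name_a, xyz_a in atoms_a:
        for name_b, xyz_b in atoms_b:
            both_main = name_a in MAIN_CHAIN and name_b in MAIN_CHAIN
            if ignore_main_chain_pairs and both_main:
                continue
            if CUTOFF >= math.dist(xyz_a, xyz_b):
                return True
    return False


def partner_numbers(residues):
    """Return the partner number of every residue of a single chain.

    residues is a list in sequence order; positions i and i+1 are
    treated as flanking residues (no chain breaks are assumed).
    """
    counts = []
    for i, atoms_i in enumerate(residues):
        n_partners = 0
        for j, atoms_j in enumerate(residues):
            if i == j:
                continue
            flanking = abs(i - j) == 1
            if in_contact(atoms_i, atoms_j, ignore_main_chain_pairs=flanking):
                n_partners += 1
        counts.append(n_partners)
    return counts
```

         <p>The double loop over residues and atoms is quadratic and is kept only for clarity; a spatial grid or neighbour list would be the usual choice for real structures.</p>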
         <p>Two parameters R<sub>p </sub>and R<sub>s </sub>based on the observed partner number and the accessibility at a given position in the protein sequence, as compared to the average value of the parameters for the same residue type in the whole database, were developed as given in the following two equations</p>
         <p>
            <display-formula id="M2">
               <graphic file="1472-6807-9-76-i2.gif"/>
            </display-formula>
         </p>
         <p>
            <display-formula id="M3">
               <graphic file="1472-6807-9-76-i3.gif"/>
            </display-formula>
         </p>
         <p>where PN<sub>xi </sub>and ASA<sub>xi </sub>are the observed partner number and the solvent accessible surface area, respectively, for a residue of type x occurring at a particular position, i, in a PDB file and &lt;PN<sub>x</sub>> and &lt;ASA<sub>x</sub>> are the average values for the residue type x in the analyzed dataset. Considering (3), the function sums up the absolute value of the deviation of ASA at each position in the sequence from the average ASA of the residue type, each term being normalized by the average ASA value. The magnitude of each of the two parameters derived using (2) and (3) is used to discriminate the near-native fold from the misfolded decoys. For the correct fold the values of these two parameters should be at a minimum.</p>
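         <p>A minimal Python sketch of a score of this form follows, based only on the verbal description of Eq. (3) given above. The function name, the input layout and the per_residue flag are illustrative assumptions. In particular, dividing the sum by the number of residues is assumed here and is not stated in the text.</p>

```python
def deviation_score(observed, averages, per_residue=True):
    """Sum of normalized absolute deviations from residue-type averages.

    observed : list of (residue_type, value) pairs, one per sequence
               position, where value is the observed ASA or partner number.
    averages : dict mapping residue_type to its average value in the
               reference dataset.
    per_residue : if True, divide the sum by the number of positions
                  (an assumption, see the accompanying paragraph).
    """
    if not observed:
        raise ValueError("at least one residue is required")
    total = 0.0
    for residue_type, value in observed:
        mean = averages[residue_type]
        # each term is normalized by the average value of the residue type
        total += abs(value - mean) / mean
    if per_residue:
        return total / len(observed)
    return total
```

         <p>For instance, with hypothetical averages of 20 &#197;<sup>2 </sup>for both residue types, deviation_score([("ALA", 30.0), ("LEU", 10.0)], {"ALA": 20.0, "LEU": 20.0}) gives 0.5, since each position deviates by half of its average. The same function could be applied to the cross-comparison described in the footnote of Table 6 by pairing the values observed in one structure with the aligned residue types of the other sequence.</p>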
         <p>A number of decoy datasets were taken from the literature; the details are provided in Results. The Z-score of the native structure with respect to the misfolded decoys was also evaluated. The Z-scores using the residue accessibility (Z<sub>s</sub>) and residue partner number (Z<sub>p</sub>) of a particular protein conformation are defined by the following equations</p>
         <p>
            <display-formula id="M4">
               <graphic file="1472-6807-9-76-i4.gif"/>
            </display-formula>
         </p>
         <p>
            <display-formula id="M5">
               <graphic file="1472-6807-9-76-i5.gif"/>
            </display-formula>
         </p>
         <p>where R<sub>s-nat </sub>(or R<sub>p-nat</sub>) is the value of the parameter for the native conformation, and &lt;R<sub>s</sub>> (&lt;R<sub>p</sub>>) and &#963; are the average and the standard deviation of the distribution of the parameter in the set. The magnitude of the Z-score indicates how far the native conformation is separated from the near-native structures in the distribution.</p>
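         <p>The Z-score described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the sign convention (native value minus the set average), the use of the population standard deviation and the choice of passing only the decoy values as the distribution are all assumptions, as is the function name.</p>

```python
import statistics


def z_score(native_value, set_values):
    """Z-score of the native R value relative to a set of R values.

    native_value : R_s or R_p of the native conformation.
    set_values   : R_s or R_p values of the conformations in the set.
    Assumed convention: (native - mean) / sigma, so a native structure
    with a lower R than the set average gets a negative Z-score.
    """
    mean = statistics.fmean(set_values)
    sigma = statistics.pstdev(set_values)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("standard deviation of the set is zero")
    return (native_value - mean) / sigma
```

         <p>With the hypothetical values z_score(0.5, [1.0, 3.0]), the set has an average of 2.0 and a standard deviation of 1.0, so the result is -1.5.</p>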
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>PC conceptualized the work, which was carried out by RPB. RPB and PC participated in the interpretation of the data and in writing the manuscript. Both authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We are grateful to the anonymous reviewers for their comments on the manuscript. The work was supported by a grant from the Department of Biotechnology, India. RPB thanks SRIC of IIT, Kharagpur for a startup grant.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Toward high-resolution de novo structure prediction for small proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Bradley</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Misura</snm>
                  <fnm>KMS</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2005</pubdate>
            <volume>309</volume>
            <fpage>1868</fpage>
            <lpage>1871</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1113801</pubid>
                  <pubid idtype="pmpid" link="fulltext">16166519</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Macromolecular modeling with Rosetta</p>
            </title>
            <aug>
               <au>
                  <snm>Das</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>2008</pubdate>
            <volume>77</volume>
            <fpage>363</fpage>
            <lpage>382</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.biochem.77.062906.171838</pubid>
                  <pubid idtype="pmpid" link="fulltext">18410248</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Effective energy functions for protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Lazaridis</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Karplus</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>139</fpage>
            <lpage>145</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(00)00063-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">10753811</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Protein model refinement using an optimized physics-based all-atom force field</p>
            </title>
            <aug>
               <au>
                  <snm>Jagielska</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wroblewska</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Skolnick</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2008</pubdate>
            <volume>105</volume>
            <fpage>8268</fpage>
            <lpage>8273</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0800054105</pubid>
                  <pubid idtype="pmcid">2448826</pubid>
                  <pubid idtype="pmpid" link="fulltext">18550813</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Generating and testing protein folds</p>
            </title>
            <aug>
               <au>
                  <snm>Wodak</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Rooman</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1993</pubdate>
            <volume>3</volume>
            <fpage>247</fpage>
            <lpage>259</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/S0959-440X(05)80160-5</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Knowledge based potentials for proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Sippl</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1995</pubdate>
            <volume>5</volume>
            <fpage>229</fpage>
            <lpage>235</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0959-440X(95)80081-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">7648326</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge based prediction of local structures in globular proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Sippl</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>213</volume>
            <fpage>859</fpage>
            <lpage>883</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0022-2836(05)80269-4</pubid>
                  <pubid idtype="pmpid">2359125</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Method to identify protein sequences that fold into known three-dimensional structures</p>
            </title>
            <aug>
               <au>
                  <snm>Bowie</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>L&#252;thy</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1991</pubdate>
            <volume>253</volume>
            <fpage>164</fpage>
            <lpage>170</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1853201</pubid>
                  <pubid idtype="pmpid" link="fulltext">1853201</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>A new approach to protein fold recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Jones</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1992</pubdate>
            <volume>358</volume>
            <fpage>86</fpage>
            <lpage>89</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1038/358086a0</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>An empirical energy function for threading protein sequence through the folding motif</p>
            </title>
            <aug>
               <au>
                  <snm>Bryant</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1993</pubdate>
            <volume>16</volume>
            <fpage>92</fpage>
            <lpage>112</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.340160110</pubid>
                  <pubid idtype="pmpid">8497488</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>How to derive a protein folding potential? A new approach to an old problem</p>
            </title>
            <aug>
               <au>
                  <snm>Mirny</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Shakhnovich</snm>
                  <fnm>EI</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1996</pubdate>
            <volume>264</volume>
            <fpage>1164</fpage>
            <lpage>1179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1996.0704</pubid>
                  <pubid idtype="pmpid" link="fulltext">9000638</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Energy functions that discriminate X-ray and near-native folds from well-constructed decoys</p>
            </title>
            <aug>
               <au>
                  <snm>Park</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1996</pubdate>
            <volume>258</volume>
            <fpage>367</fpage>
            <lpage>392</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1996.0256</pubid>
                  <pubid idtype="pmpid" link="fulltext">8627632</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Samudrala</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Moult</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>275</volume>
            <fpage>895</fpage>
            <lpage>916</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1997.1479</pubid>
                  <pubid idtype="pmpid" link="fulltext">9480776</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>A distance-dependent atomic knowledge-based potential for improved protein structure selection</p>
            </title>
            <aug>
               <au>
                  <snm>Lu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Skolnick</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2001</pubdate>
            <volume>44</volume>
            <fpage>223</fpage>
            <lpage>232</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.1087</pubid>
                  <pubid idtype="pmpid" link="fulltext">11455595</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Distinguishing native conformations of proteins from decoys with an effective free energy estimator based on the OPLS all-atom force field and the surface generalized Born solvent model</p>
            </title>
            <aug>
               <au>
                  <snm>Felts</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Gallicchio</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wallqvist</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2002</pubdate>
            <volume>48</volume>
            <fpage>404</fpage>
            <lpage>422</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.10171</pubid>
                  <pubid idtype="pmpid" link="fulltext">12112706</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>An improved protein decoy set for testing energy functions for protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Tsai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bonneau</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Morozov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Kuhlman</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rohl</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2003</pubdate>
            <volume>53</volume>
            <fpage>76</fpage>
            <lpage>87</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.10454</pubid>
                  <pubid idtype="pmpid" link="fulltext">12945051</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Evaluation of protein models by atomic solvation preference</p>
            </title>
            <aug>
               <au>
                  <snm>Holm</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1992</pubdate>
            <volume>225</volume>
            <fpage>93</fpage>
            <lpage>105</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(92)91028-N</pubid>
                  <pubid idtype="pmpid" link="fulltext">1583696</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p><it>Ab initio </it>protein structure prediction of CASP III targets using ROSETTA</p>
            </title>
            <aug>
               <au>
                  <snm>Simons</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Bonneau</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ruczinski</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1999</pubdate>
            <volume>S3</volume>
            <fpage>171</fpage>
            <lpage>176</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1002/(SICI)1097-0134(1999)37:3+&lt;171::AID-PROT21>3.0.CO;2-Z</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Decoys 'R' Us: a database of incorrect conformations to improve protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Samudrala</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2000</pubdate>
            <volume>9</volume>
            <fpage>1399</fpage>
            <lpage>1401</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1110/ps.9.7.1399</pubid>
                  <pubid idtype="pmcid">2144680</pubid>
                  <pubid idtype="pmpid" link="fulltext">10933507</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Critical assessment of methods of protein structure prediction-Round VII</p>
            </title>
            <aug>
               <au>
                  <snm>Moult</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Fidelis</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kryshtafovych</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rost</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tramontano</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2007</pubdate>
            <volume>69</volume>
            <issue>Suppl 8</issue>
            <fpage>3</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.21767</pubid>
                  <pubid idtype="pmcid">2653632</pubid>
                  <pubid idtype="pmpid" link="fulltext">17918729</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Servers for protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Fischer</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>178</fpage>
            <lpage>182</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/j.sbi.2006.03.004</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>QMEAN: A comprehensive scoring function for model quality assessment</p>
            </title>
            <aug>
               <au>
                  <snm>Benkert</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Tosatto</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Schomburg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2008</pubdate>
            <volume>71</volume>
            <fpage>261</fpage>
            <lpage>277</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.21715</pubid>
                  <pubid idtype="pmpid" link="fulltext">17932912</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Structure-derived potentials and protein simulations</p>
            </title>
            <aug>
               <au>
                  <snm>Jernigan</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Bahar</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>195</fpage>
            <lpage>209</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(96)80075-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">8728652</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Crystal Structure Analysis. A Primer</p>
            </title>
            <aug>
               <au>
                  <snm>Glusker</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Trueblood</snm>
                  <fnm>KN</fnm>
               </au>
            </aug>
            <publisher>Oxford University Press, New York</publisher>
            <pubdate>1985</pubdate>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Accessibility and partner number of protein residues, their relationship and a webserver, ContPlot for their display</p>
            </title>
            <aug>
               <au>
                  <snm>Pal</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bahadur</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Ray</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Chakrabarti</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2009</pubdate>
            <volume>10</volume>
            <fpage>103</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/1471-2105-10-103</pubid>
                  <pubid idtype="pmcid">2680847</pubid>
                  <pubid idtype="pmpid" link="fulltext">19356223</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Quantifying the accessible surface area of protein residues in their local environment</p>
            </title>
            <aug>
               <au>
                  <snm>Samanta</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Bahadur</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Chakrabarti</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>2002</pubdate>
            <volume>15</volume>
            <fpage>659</fpage>
            <lpage>667</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/15.8.659</pubid>
                  <pubid idtype="pmpid" link="fulltext">12364580</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Assessing the role of tryptophan residues in the binding site</p>
            </title>
            <aug>
               <au>
                  <snm>Samanta</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Chakrabarti</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>2001</pubdate>
            <volume>14</volume>
            <fpage>7</fpage>
            <lpage>15</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/14.1.7</pubid>
                  <pubid idtype="pmpid" link="fulltext">11287674</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Cavities and atomic packing in protein structures and interfaces</p>
            </title>
            <aug>
               <au>
                  <snm>Sonavane</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chakrabarti</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2008</pubdate>
            <volume>4</volume>
            <issue>9</issue>
            <fpage>e1000188</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1371/journal.pcbi.1000188</pubid>
                  <pubid idtype="pmcid">2582456</pubid>
                  <pubid idtype="pmpid" link="fulltext">19005575</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>An analysis of protein folding pathways</p>
            </title>
            <aug>
               <au>
                  <snm>Moult</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Unger</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1991</pubdate>
            <volume>30</volume>
            <fpage>3816</fpage>
            <lpage>3824</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi00230a003</pubid>
                  <pubid idtype="pmpid">2018757</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>A critical assessment of comparative molecular modeling of tertiary structures of proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Mosimann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Meleshko</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>James</snm>
                  <fnm>MN</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1995</pubdate>
            <volume>23</volume>
            <fpage>301</fpage>
            <lpage>317</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.340230305</pubid>
                  <pubid idtype="pmpid">8710824</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>PROSTAR: The protein potential test site</p>
            </title>
            <aug>
               <au>
                  <snm>Braxenthaler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Samudrala</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pedersen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Luo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Milash</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Moult</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <pubdate>1997</pubdate>
            <url>http://dd.compbio.washington.edu/download.shtml</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">9408939</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Molecular dynamics study of the structure and dynamics of a protein molecule in a crystalline ionic environment, <it>Streptomyces griseus</it> protease A</p>
            </title>
            <aug>
               <au>
                  <snm>Avbelj</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Moult</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kitson</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>James</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Hagler</snm>
                  <fnm>AT</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1990</pubdate>
            <volume>29</volume>
            <fpage>8658</fpage>
            <lpage>8676</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi00489a023</pubid>
                  <pubid idtype="pmpid">2125469</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Derivation of protein-specific pair potentials based on weak sequence fragment similarity</p>
            </title>
            <aug>
               <au>
                  <snm>Skolnick</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kolinski</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ortiz</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2000</pubdate>
            <volume>38</volume>
            <fpage>3</fpage>
            <lpage>16</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1097-0134(20000101)38:1&lt;3::AID-PROT2&gt;3.0.CO;2-S</pubid>
                  <pubid idtype="pmpid" link="fulltext">10651034</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Potential energy function and parameters for simulation of the molecular dynamics of proteins and nucleic acids in solutions</p>
            </title>
            <aug>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hirshberg</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sharon</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Daggett</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Comput Phys Commun</source>
            <pubdate>1995</pubdate>
            <volume>91</volume>
            <fpage>215</fpage>
            <lpage>231</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/0010-4655(95)00049-L</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Sequence-specific <sup>1</sup>H NMR assignments and solution structure of bovine pancreatic polypeptide</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Sutcliffe</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Dobson</snm>
                  <fnm>CM</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1992</pubdate>
            <volume>31</volume>
            <fpage>1245</fpage>
            <lpage>1253</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi00119a038</pubid>
                  <pubid idtype="pmpid">1734969</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Crystallographic refinement and atomic models of a human Fc fragment and its complex with fragment B of protein A from <it>Staphylococcus aureus</it> at 2.9 &#197; and 2.8 &#197; resolution</p>
            </title>
            <aug>
               <au>
                  <snm>Deisenhofer</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1981</pubdate>
            <volume>20</volume>
            <fpage>2361</fpage>
            <lpage>2370</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi00512a001</pubid>
                  <pubid idtype="pmpid">7236608</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions</p>
            </title>
            <aug>
               <au>
                  <snm>Simons</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Kooperberg</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1997</pubdate>
            <volume>268</volume>
            <fpage>209</fpage>
            <lpage>225</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1997.0959</pubid>
                  <pubid idtype="pmpid" link="fulltext">9149153</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Accurate modeling of protein conformation by automatic segment matching</p>
            </title>
            <aug>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1992</pubdate>
            <volume>226</volume>
            <fpage>507</fpage>
            <lpage>533</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(92)90964-L</pubid>
                  <pubid idtype="pmpid" link="fulltext">1640463</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Structure-derived hydrophobic potential. Hydrophobic potential derived from X-ray structures of globular proteins is able to identify native folds</p>
            </title>
            <aug>
               <au>
                  <snm>Casari</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sippl</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1992</pubdate>
            <volume>224</volume>
            <fpage>725</fpage>
            <lpage>732</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(92)90556-Y</pubid>
                  <pubid idtype="pmpid" link="fulltext">1569551</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches</p>
            </title>
            <aug>
               <au>
                  <snm>Kocher</snm>
                  <fnm>J-PA</fnm>
               </au>
               <au>
                  <snm>Rooman</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Wodak</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>235</volume>
            <fpage>1598</fpage>
            <lpage>1613</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1994.1109</pubid>
                  <pubid idtype="pmpid" link="fulltext">8107094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Recognizing native folds by the arrangement of hydrophobic and polar residues</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Subbiah</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1995</pubdate>
            <volume>252</volume>
            <fpage>709</fpage>
            <lpage>720</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1995.0529</pubid>
                  <pubid idtype="pmpid" link="fulltext">7563083</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Statistical potentials for fold assessment</p>
            </title>
            <aug>
               <au>
                  <snm>Melo</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Sanchez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sali</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2002</pubdate>
            <volume>11</volume>
            <fpage>430</fpage>
            <lpage>448</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1110/ps.25502</pubid>
                  <pubid idtype="pmcid">2373452</pubid>
                  <pubid idtype="pmpid" link="fulltext">11790853</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Wiederstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sippl</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Web Server issue</issue>
            <fpage>W407</fpage>
            <lpage>W410</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkm290</pubid>
                  <pubid idtype="pmcid">1933241</pubid>
                  <pubid idtype="pmpid" link="fulltext">17517781</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>VERIFY3D: assessment of protein models with three-dimensional profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>L&#252;thy</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bowie</snm>
                  <fnm>JU</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>1997</pubdate>
            <volume>277</volume>
            <fpage>396</fpage>
            <lpage>404</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">9379925</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Novel knowledge-based mean force potential at atomic level</p>
            </title>
            <aug>
               <au>
                  <snm>Melo</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Feytmans</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1997</pubdate>
            <volume>267</volume>
            <fpage>207</fpage>
            <lpage>222</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1996.0868</pubid>
                  <pubid idtype="pmpid" link="fulltext">9096219</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Discrimination of native protein structures using atom-atom contact scoring</p>
            </title>
            <aug>
               <au>
                  <snm>McConkey</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Sobolev</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Edelman</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>3215</fpage>
            <lpage>3220</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0535768100</pubid>
                  <pubid idtype="pmcid">152272</pubid>
                  <pubid idtype="pmpid" link="fulltext">12631702</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>An atomic environment potential for use in protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Summa</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>DeGrado</snm>
                  <fnm>WF</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2005</pubdate>
            <volume>352</volume>
            <fpage>986</fpage>
            <lpage>1001</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2005.07.054</pubid>
                  <pubid idtype="pmpid" link="fulltext">16126228</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation</p>
            </title>
            <aug>
               <au>
                  <snm>Miyazawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jernigan</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>Macromolecules</source>
            <pubdate>1985</pubdate>
            <volume>18</volume>
            <fpage>534</fpage>
            <lpage>552</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1021/ma00145a039</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Distance dependent centroid to centroid force fields using high resolution decoys</p>
            </title>
            <aug>
               <au>
                  <snm>Rajgaria</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>McAllister</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Floudas</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2008</pubdate>
            <volume>70</volume>
            <fpage>950</fpage>
            <lpage>970</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.21561</pubid>
                  <pubid idtype="pmpid" link="fulltext">17847088</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>A knowledge-based scoring function based on residue triplets for protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Ngan</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Inouye</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Samudrala</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Protein Eng Des Sel</source>
            <pubdate>2006</pubdate>
            <volume>19</volume>
            <fpage>187</fpage>
            <lpage>193</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/gzj018</pubid>
                  <pubid idtype="pmpid" link="fulltext">16533801</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Four-body contact potentials derived from two protein datasets to discriminate native structures from decoys</p>
            </title>
            <aug>
               <au>
                  <snm>Feng</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kloczkowski</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jernigan</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2007</pubdate>
            <volume>68</volume>
            <fpage>57</fpage>
            <lpage>66</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.21362</pubid>
                  <pubid idtype="pmpid" link="fulltext">17393455</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Who checks the checkers? Four validation tools applied to eight atomic resolution structures</p>
            </title>
            <aug>
               <au>
                  <snm>Wilson</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Butterworth</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dauter</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Lamzin</snm>
                  <fnm>VS</fnm>
               </au>
               <au>
                  <snm>Walsh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wodak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pontius</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Richelle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Vaguine</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hooft</snm>
                  <fnm>RWW</fnm>
               </au>
               <au>
                  <snm>Vriend</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Laskowski</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>MacArthur</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Dodson</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Murshudov</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Oldfield</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Kaptein</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rullmann</snm>
                  <fnm>JAC</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>276</volume>
            <fpage>417</fpage>
            <lpage>436</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1006/jmbi.1997.1526</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Structural features of azurin at 2.7 &#197; resolution</p>
            </title>
            <aug>
               <au>
                  <snm>Adman</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>LH</fnm>
               </au>
            </aug>
            <source>Isr J Chem</source>
            <pubdate>1981</pubdate>
            <volume>21</volume>
            <fpage>8</fpage>
            <lpage>12</lpage>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Crystal structure analyses of reduced (Cu<sup>I</sup>) poplar plastocyanin at six pH values</p>
            </title>
            <aug>
               <au>
                  <snm>Guss</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Harrowell</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Murata</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Norris</snm>
                  <fnm>VA</fnm>
               </au>
               <au>
                  <snm>Freeman</snm>
                  <fnm>HC</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1986</pubdate>
            <volume>192</volume>
            <fpage>361</fpage>
            <lpage>387</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(86)90371-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">3560221</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>The Protein Data Bank</p>
            </title>
            <aug>
               <au>
                  <snm>Berman</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Westbrook</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Feng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Gilliland</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bhat</snm>
                  <fnm>TN</fnm>
               </au>
               <au>
                  <snm>Weissig</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Shindyalov</snm>
                  <fnm>IN</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>235</fpage>
            <lpage>242</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/28.1.235</pubid>
                  <pubid idtype="pmcid">102472</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592235</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>NACCESS: program for calculating accessibilities</p>
            </title>
            <aug>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>Department of Biochemistry and Molecular Biology</source>
            <publisher>University College London</publisher>
            <pubdate>1992</pubdate>
            <url>http://wolf.bms.umist.ac.uk/naccess/</url>
         </bibl>
         <bibl id="B57">
            <title>
               <p>The interpretation of protein structures: estimation of static accessibility</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Richards</snm>
                  <fnm>FM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1971</pubdate>
            <volume>55</volume>
            <fpage>379</fpage>
            <lpage>400</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(71)90324-X</pubid>
                  <pubid idtype="pmpid">5551392</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Geometry of nonbonded interactions involving planar groups in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Chakrabarti</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bhattacharyya</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Prog Biophys Mol Biol</source>
            <pubdate>2007</pubdate>
            <volume>95</volume>
            <fpage>83</fpage>
            <lpage>137</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.pbiomolbio.2007.03.016</pubid>
                  <pubid idtype="pmpid" link="fulltext">17629549</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>CATH - a hierarchic classification of protein domain structures</p>
            </title>
            <aug>
               <au>
                  <snm>Orengo</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Michie</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Swindells</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>1997</pubdate>
            <volume>5</volume>
            <fpage>1093</fpage>
            <lpage>1108</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(97)00260-8</pubid>
                  <pubid idtype="pmpid">9309224</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
