<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-9-320</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Yang</snm>
               <fnm>Qingwu</fnm>
               <insr iid="I1"/>
               <email>qingwu-yang@neo.tamu.edu</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Sze</snm>
               <fnm>Sing-Hoi</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>shsze@cs.tamu.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Computer Science, Texas A&amp;M University, College Station, TX 77843, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Biochemistry &amp; Biophysics, Texas A&amp;M University, College Station, TX 77843, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>320</fpage>
         <url>http://www.biomedcentral.com/1471-2105/9/320</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18651953</pubid>
               <pubid idtype="doi">10.1186/1471-2105-9-320</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>05</day>
               <month>4</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>23</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>23</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Yang and Sze; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures, each of which contains a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at <url>http://faculty.cs.tamu.edu/shsze/ssfold</url>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>As early studies revealed that an unfolded protein can fold spontaneously to a three-dimensional structure under suitable environmental conditions <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>, traditional approaches to understanding protein folding have focused on the prediction of the native structure. As more studies showed the existence of intermediates and interactions among residues during the protein folding process <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, there is substantial interest in understanding the time order of events during the formation of the tertiary structure. From the free energy point of view, each conformation of a protein is associated with a free energy and the protein folds from the high-energy denatured conformation to its folded structure along a funnel-like energy landscape <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>.</p>
         <p>Although advances in experimental techniques allow the investigation of protein folding pathways at the microsecond timescale <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>, experimental determination of protein folding pathways remains difficult. Most studies are only able to identify general characteristics of the folding pathway without much detail and are limited to analyzing small proteins. Computational techniques are often used to simulate protein folding, and the problem is transformed into an energy optimization problem, that is, a computational search for the global energy minimum over all possible conformations. The most accurate computational techniques utilize molecular dynamics to determine the order of events that lead to the tertiary structure through atomic-level simulations <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. Due to the extremely large conformation space, these approaches suffer from well-known problems accompanying high dimensionality, including high computational cost and a tendency to become trapped in local minima, and are applicable only to small proteins over short time courses.</p>
         <p>By omitting some details, proteins can be represented at the level of amino acids. Kolinski and Skolnick <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> performed Monte Carlo simulations of protein folding on a reduced lattice representation of the protein <it>&#945;</it>-carbon backbone. Yue and Dill <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> limited the conformation space to a discrete subset of possibilities and used a branch-and-bound procedure to search for near-optimal conformations. Alm and Baker <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and Mu&#241;oz and Eaton <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> further observed that the availability of the known native structure can dramatically reduce the search space. Alm and Baker <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> took into account only native interactions among residues and used a sequential binary collision model to predict protein folding mechanisms from the perspectives of free energy landscapes, while Mu&#241;oz and Eaton <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> used a slightly different approach of employing distinct free energy costs for different secondary structures. Amato and Song <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> represented a protein by the torsional angles of its residues and used the probabilistic roadmap technique with a biased sampling strategy around the native structure to predict folding pathways and secondary structure formation order. Liwo et al <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> and Kmiecik and Kolinski <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp> showed that the use of reduced models of proteins is highly successful in characterizing folding pathways for small proteins at the mesoscopic level. 
Although these techniques are able to predict folding pathways very accurately for proteins with up to about 100 residues, the majority of proteins in the Protein Data Bank (PDB) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> are much larger (Figure <figr fid="F1">1</figr>).</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>The distribution of the number of atoms, the number of amino acid residues, and the number of secondary structure elements among 32237 protein structures in the Protein Data Bank (PDB) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp></p>
            </caption>
            <text>
               <p><b>The distribution of the number of atoms, the number of amino acid residues, and the number of secondary structure elements among 32237 protein structures in the Protein Data Bank (PDB) </b><abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Each bar (except the rightmost one in each chart) shows the number of proteins that have values falling between the indicated lower and upper limits. The rightmost bar in each chart shows the number of proteins that have values of at least the indicated lower limit.</p>
            </text>
            <graphic file="1471-2105-9-320-1"/>
         </fig>
         <p>The problem with representing a protein at the amino acid level is that even with the assumption that each residue has only two states (ordered or disordered), a protein with <it>n </it>residues still has 2<sup><it>n </it></sup>possible conformations <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. To overcome this problem, several recent approaches represent a protein at the level of secondary structure elements (SSEs), in which each element corresponds to one helix or one <it>&#946;</it>-strand. By adopting the framework model in which secondary structures are thought to fold relatively independently of the tertiary structure <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, each SSE is treated as an indivisible unit that interacts with other SSEs as a whole. Since the number of SSEs in a protein is small (Figure <figr fid="F1">1</figr>), this model is much more tractable to simulate. Eyrich et al <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> assumed that the SSEs are fixed and used a branch-and-bound algorithm to search for near-optimal tertiary structures. Apaydin et al <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> assumed that each SSE of a protein is already in native conformation and moves as a unit, and used the probabilistic roadmap approach to predict folding pathways. Zaki et al <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> proposed an algorithm to predict unfolding pathways based on applying a minimum cut procedure to a weighted graph that represents a protein's contact map or interaction strength between SSEs. Although the underlying assumption that intermediate secondary structures are fully folded before the formation of tertiary structures is not satisfied for most proteins, these studies show that such a strategy is sufficient to study protein folding pathways at the mesoscopic level.</p>
         <p>In this paper, our goal is to further reduce the conformation space without sacrificing prediction accuracy. This is achieved by assuming that SSEs that do not yet interact with each other are independent and can be treated separately. A conformation is represented by a collection of fully folded structures, each of which contains a set of interacting SSEs. By using a steepest descent strategy, we show that it is possible to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, and that this model is detailed enough to distinguish between different folding pathways of structurally very similar proteins. In contrast to the technique in <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, we do not model the spatial movement of the SSEs before they form native contacts, and thus we are able to achieve much better computational efficiency.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Assume that the native structure of a protein is known. The protein folding pathway prediction problem is to find an ordered sequence of intermediate conformations to fill the gap between the unfolded state and the native tertiary structure. At the secondary structure level, a protein can be viewed as an ordered sequence of secondary structure elements (SSEs) interspersed with irregular turns or loops, where each SSE is either a helix or a <it>&#946;</it>-strand, and each <it>&#946;</it>-sheet consists of a variable number of <it>&#946;</it>-strands that are not necessarily consecutive on the primary sequence. We represent each protein by <it>t</it><sub>0</sub><it>s</it><sub>1</sub><it>t</it><sub>1</sub>&#8943;<it>s</it><sub><it>k</it></sub><it>t</it><sub><it>k</it></sub>, where <it>k </it>is the number of SSEs, <it>s</it><sub><it>i </it></sub>denotes the <it>i</it>th SSE, <it>t</it><sub><it>j </it></sub>denotes the <it>j</it>th turn, and these elements are in the same order as they appear on the primary sequence. Given the three-dimensional structure of a protein, the assignment of SSEs can be obtained directly from the Protein Data Bank (PDB) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> or using programs such as DSSP <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>.</p>
         <p>Following <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> and <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, we consider each SSE as an indivisible unit that folds independently of the others according to the contacts present in the native structure. This is based on the framework model that assumes that extensive intermediate secondary structures exist before they are assembled into the tertiary structure <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, and our goal is to predict the interaction order of SSEs during folding. Based on the observation in <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> that a model using only native interactions can explain most experimental results, we assume that the interactions between SSEs or turns are the same as the ones present in the native structure. These assumptions are often not satisfied, since in many proteins there are no clear secondary structures before the formation of tertiary structures, or secondary structures are not clearly preserved throughout folding. Nevertheless, such a strategy is sufficient for studying folding pathways at the mesoscopic level and is often the only feasible choice for large proteins.</p>
         <p>We represent a conformation of a protein on the folding pathway by <it>C </it>= {<it>S</it><sub>1</sub>, ..., <it>S</it><sub><it>k</it></sub>}, where each <it>S</it><sub><it>i </it></sub>represents a structure consisting of a set of fully folded SSEs and there are no interactions between two different sets <it>S</it><sub><it>j </it></sub>and <it>S</it><sub><it>j</it>' </sub>(see Figure <figr fid="F2">2</figr> for an illustration). Since our focus is on the SSEs, turns are not included in the conformation but will be utilized when computing energies (see below). The protein folding problem is thus transformed into identifying a sequence of conformational changes that starts from an initial state with fully folded SSEs but no interactions between SSEs, passes through some intermediate conformations, and ends in the native structure (Figure <figr fid="F2">2</figr>). Each conformational change corresponds to finding a new pair of interactions that merges two smaller structures of SSEs into a bigger one. Figure <figr fid="F2">2</figr> illustrates the folding pathway prediction on the B1 domain of the streptococcal protein G (GB1). In the prediction, <it>&#946;</it><sub>3 </sub>and <it>&#946;</it><sub>4 </sub>interact first, then <it>&#945;</it><sub>1 </sub>is added, followed by <it>&#946;</it><sub>1 </sub>and then <it>&#946;</it><sub>2</sub>.</p>
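         <p>The conformation representation above can be illustrated with a small Python sketch that replays the GB1 pathway described in the text as a sequence of merges. This is only an illustration under stated assumptions: the string labels (b1, b2, a1, b3, b4), the helper names, and the particular SSE named on each side of a merge are hypothetical and are not part of SSFold; only the resulting sets follow the description above.</p>
         <p>
```python
from typing import FrozenSet, List, Tuple

Structure = FrozenSet[str]             # one folded structure: a set of SSE labels
Conformation = FrozenSet[Structure]    # a conformation: a set of disjoint structures


def initial_conformation(sses: List[str]) -> Conformation:
    """Every SSE starts as its own singleton structure (no SSE-SSE interactions)."""
    return frozenset(frozenset([s]) for s in sses)


def merge(conf: Conformation, a: str, b: str) -> Conformation:
    """Merge the structures containing SSEs a and b into one bigger structure."""
    sa = next(s for s in conf if a in s)
    sb = next(s for s in conf if b in s)
    if sa == sb:
        raise ValueError("a and b already belong to the same structure")
    return (conf - {sa, sb}) | {sa | sb}


# Hypothetical labels for the five SSEs of GB1, in primary-sequence order.
gb1 = ["b1", "b2", "a1", "b3", "b4"]
# Merge order taken from the text; the SSE chosen to name each side is an assumption.
steps: List[Tuple[str, str]] = [("b3", "b4"), ("a1", "b3"), ("b1", "a1"), ("b2", "b1")]

pathway = [initial_conformation(gb1)]
for a, b in steps:
    pathway.append(merge(pathway[-1], a, b))

for conf in pathway:
    print(sorted(sorted(s) for s in conf))
```
         </p>
         <p>Each merge reduces the number of structures by one, so a protein with five SSEs reaches the single native structure after four merges.</p>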
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>Illustration of the folding pathway prediction for GB1</p>
            </caption>
            <text>
               <p><b>Illustration of the folding pathway prediction for GB1</b>. The starting conformation {{<it>&#946;</it><sub>1</sub>}, {<it>&#946;</it><sub>2</sub>}, {<it>&#945;</it><sub>1</sub>}, {<it>&#946;</it><sub>3</sub>}, {<it>&#946;</it><sub>4</sub>}} corresponds to the initial state. There are three intermediate conformations in the predicted folding pathway, including {{<it>&#946;</it><sub>1</sub>}, {<it>&#946;</it><sub>2</sub>}, {<it>&#945;</it><sub>1</sub>}, {<it>&#946;</it><sub>3</sub>, <it>&#946;</it><sub>4</sub>}}, {{<it>&#946;</it><sub>1</sub>}, {<it>&#946;</it><sub>2</sub>}, {<it>&#945;</it><sub>1</sub>, <it>&#946;</it><sub>3</sub>, <it>&#946;</it><sub>4</sub>}}, and {{<it>&#946;</it><sub>2</sub>}, {<it>&#946;</it><sub>1</sub>, <it>&#945;</it><sub>1</sub>, <it>&#946;</it><sub>3</sub>, <it>&#946;</it><sub>4</sub>}}. The ending conformation {{<it>&#946;</it><sub>1</sub>, <it>&#946;</it><sub>2</sub>, <it>&#945;</it><sub>1</sub>, <it>&#946;</it><sub>3</sub>, <it>&#946;</it><sub>4</sub>}} corresponds to the native state.</p>
            </text>
            <graphic file="1471-2105-9-320-2"/>
         </fig>
         <p>Folding pathway predictions are obtained through the computation of free energies of intermediate conformations. For an intermediate conformation <it>C </it>= {<it>S</it><sub>1</sub>, ..., <it>S</it><sub><it>k</it></sub>}, the free energy <it>E</it>(<it>C</it>) of <it>C </it>is defined as:</p>
         <p>
            <display-formula>
               <m:math name="1471-2105-9-320-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>E</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>C</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mstyle displaystyle="true">
                           <m:munderover>
                              <m:mo>&#8721;</m:mo>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>=</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mi>k</m:mi>
                           </m:munderover>
                           <m:mrow>
                              <m:mi>E</m:mi>
                              <m:mo stretchy="false">(</m:mo>
                              <m:msub>
                                 <m:mi>S</m:mi>
                                 <m:mi>i</m:mi>
                              </m:msub>
                              <m:mo stretchy="false">)</m:mo>
                           </m:mrow>
                        </m:mstyle>
                        <m:mo>,</m:mo>
                     </m:mrow>
                     <m:annotation encoding="application/x-tex">E(C) = \sum_{i=1}^{k} E(S_i),</m:annotation>
                  </m:semantics>
               </m:math>
            </display-formula>
         </p>
         <p>where each <it>S</it><sub><it>i </it></sub>is viewed as an isolated entity and each <it>E</it>(<it>S</it><sub><it>i</it></sub>) is obtained separately by extracting the three-dimensional coordinates of its residues from the Protein Data Bank (PDB) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> and using the Rosetta software <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> to compute its free energy. The original Rosetta energy function is used, which is obtained by representing each side chain by a centroid that is located at the center of mass, and computing a weighted sum of the binned probability descriptions of multiple effects, including the solvation and electrostatic effects based on observed distributions in known protein structures, the secondary structure packing effects that include strand pairing, strand arrangement into sheets and helix-strand packing, and the effects of steric repulsion and van der Waals interactions (more details are available in <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> and in Table I of <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>). To take the backbone into consideration, a turn <it>t</it><sub><it>j </it></sub>is included in the computation of <it>E</it>(<it>S</it><sub><it>i</it></sub>) if both of its adjacent SSEs <it>s</it><sub><it>j </it></sub>(if it exists) and <it>s</it><sub><it>j</it>+1 </sub>(if it exists) are included in <it>S</it><sub><it>i</it></sub>.</p>
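         <p>The turn-inclusion rule and the summation over structures can be sketched in Python as follows. This is a minimal sketch under stated assumptions: SSEs are identified by their 1-based indices, the terminal turns are treated as having a single neighbouring SSE (one reading of "if it exists"), and toy_energy is a made-up placeholder that stands in for the Rosetta energy function, which is not reproduced here.</p>
         <p>
```python
from typing import Callable, FrozenSet, Iterable, List

Structure = FrozenSet[int]  # 1-based indices of the SSEs s_1..s_k in one structure


def included_turns(structure: Structure, k: int) -> List[int]:
    """Return the indices j (0..k) of the turns t_j counted in E(S_i).

    Turn t_j lies between s_j and s_(j+1); t_0 and t_k have only one
    neighbouring SSE. A turn is kept when every neighbour that exists
    belongs to the structure.
    """
    turns = []
    for j in range(k + 1):
        neighbours = [i for i in (j, j + 1) if i >= 1 and k >= i]
        if all(i in structure for i in neighbours):
            turns.append(j)
    return turns


def conformation_energy(
    conformation: Iterable[Structure],
    structure_energy: Callable[[Structure, List[int]], float],
    k: int,
) -> float:
    """E(C) = sum of E(S_i), with each structure scored as an isolated entity."""
    return sum(structure_energy(s, included_turns(s, k)) for s in conformation)


def toy_energy(structure: Structure, turns: List[int]) -> float:
    """Hypothetical placeholder score, NOT the Rosetta energy function."""
    return -float(len(structure) ** 2 + len(turns))


k = 5
conformation = [frozenset({1}), frozenset({2}), frozenset({3, 4, 5})]
print(included_turns(frozenset({3, 4, 5}), k))  # [3, 4, 5]
print(conformation_energy(conformation, toy_energy, k))
```
         </p>
         <p>In an actual run, the placeholder would be replaced by a call that scores the extracted coordinates of the selected SSEs and turns.</p>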
         <p>Since the interactions that favor folding usually decrease the free energy while the interactions that destabilize the native structure increase the free energy, our goal is to find the most energetically favorable folding pathway by identifying the conformational change that decreases the free energy the most in each step so that the protein can get to lower energy states as quickly as possible. Figure <figr fid="F3">3</figr> illustrates our SSFold algorithm that uses a steepest descent strategy to choose a new pair of interactions that leads to a conformation with the lowest free energy in each iteration. This procedure is very efficient since only <it>k </it>- 1 iterations are needed. Within each iteration, <it>O</it>(<it>k</it><sup>2</sup>) comparisons are needed to find the best pair of interactions that results in the lowest free energy. This leads to an overall time complexity of <it>O</it>(<it>k</it><sup>3</sup><it>t</it>), where <it>k </it>is the number of SSEs in a protein and <it>t </it>is the time to compute the free energy of a potentially partial protein that contains only some of the SSEs and turns.</p>
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>Algorithm SSFold to predict the most energetically favorable interaction order of SSEs that corresponds to a folding pathway</p>
            </caption>
            <text>
               <p><b>Algorithm SSFold to predict the most energetically favorable interaction order of SSEs that corresponds to a folding pathway</b>. Each iteration corresponds to a conformational change that results from a new pair of interactions. Within a folded structure, a turn is included in the energy computations only when adjacent SSEs are included in the structure.</p>
            </text>
            <graphic file="1471-2105-9-320-3"/>
         </fig>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>We test our strategy on proteins from the Protein Data Bank (PDB) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> that have known intermediate folding states from experimental data. We illustrate that our model is detailed enough to distinguish between subtle differences in the folding pathways of the streptococcal protein G, the peptostreptococcal protein L, and variants NuG1 and NuG2 of protein G, which are all structurally very similar proteins. We demonstrate that our approach is applicable to large proteins with hundreds of residues by testing it on the 416-residue pig muscle phosphoglycerate kinase (PGK). We further test it on proteins studied in <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> to further validate the accuracy of our model.</p>
         <sec>
            <st>
               <p>Proteins GB1, LB1, NuG1 and NuG2</p>
            </st>
            <p>The 56-residue B1 immunoglobulin binding domain of streptococcal protein G (GB1, PDB: 1GB1) and the 62-residue B1 immunoglobulin binding domain of peptostreptococcal protein L (LB1, PDB: 2PTL) have been used extensively as model systems for studying protein folding mechanisms <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. Both GB1 (see Figure <figr fid="F2">2</figr>) and LB1 consist of one <it>&#946;</it>-sheet with four strands and one <it>&#945;</it>-helix. Strands 1 and 2 form an N-terminal <it>&#946;</it>-hairpin, while strands 3 and 4 form a C-terminal <it>&#946;</it>-hairpin. Although GB1 and LB1 have very similar tertiary structures, they have different folding pathways. As suggested by <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, a detailed model is needed to distinguish between them.</p>
            <p>Figure <figr fid="F4">4</figr> shows our folding pathway predictions for GB1 and LB1 (see also Figure <figr fid="F2">2</figr> for GB1).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Folding pathway predictions for GB1, LB1, NuG1 and NuG2</p>
               </caption>
               <text>
                  <p><b>Folding pathway predictions for GB1, LB1, NuG1 and NuG2</b>. Each internal node represents a new pair of interactions and nodes that are higher in the tree indicate earlier interactions. Also compare to Figure 2 for GB1.</p>
               </text>
               <graphic file="1471-2105-9-320-4"/>
            </fig>
            <p>Experimental results showed that the C-terminal <it>&#946;</it>-hairpin in GB1 is formed in the transition state of the folding pathway and serves as the starting point on which the rest of the protein can fold <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Similar results were obtained using the diffusion-collision model <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Our prediction is consistent with these results. In contrast, experimental results showed that only the N-terminal <it>&#946;</it>-hairpin in LB1 is mainly formed in the transition state and non-random structures can be detected in the region <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B39">39</abbr></abbrgrp>. Our algorithm also predicts that the N-terminal <it>&#946;</it>-hairpin forms earlier than the C-terminal <it>&#946;</it>-hairpin in LB1.</p>
            <p>Two protein G variants, NuG1 (PDB: 1MHX) and NuG2 (PDB: 1MI0), were designed to have a different folding mechanism from protein G by replacing some residues of protein G <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. In NuG1 and NuG2, the stability of the N-terminal <it>&#946;</it>-hairpin is enhanced while the stability of the C-terminal <it>&#946;</it>-hairpin is reduced, with the N-terminal <it>&#946;</it>-hairpin forming contacts earlier than the C-terminal <it>&#946;</it>-hairpin in both cases <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>.</p>
            <p>Thomas et al <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> showed that it is more difficult to distinguish between the folding pathways of protein G and its variants NuG1 and NuG2 than to distinguish between the folding pathways of protein G and protein L. In our predictions in Figure <figr fid="F4">4</figr>, NuG1 and NuG2 have the same folding pathway, with the N-terminal <it>&#946;</it>-hairpin folded first. This is consistent with the experimental results in <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> and the predictions in <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>.</p>
            <p>Figure <figr fid="F5">5</figr> shows the free energy profiles of GB1, LB1, NuG1 and NuG2 in our predictions. Our predicted folding pathway of GB1 is a non-frustrated curve, similar to the average macroscopic folding pathway given by <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. When compared to GB1, NuG1 and NuG2 have similar profiles and higher initial free energy, but their native structures have lower free energy and are more stable, which is consistent with the analysis in <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Free energy profiles of GB1, LB1, NuG1 and NuG2 in our predictions</p>
               </caption>
               <text>
                  <p><b>Free energy profiles of GB1, LB1, NuG1 and NuG2 in our predictions</b>. A native contact is defined to be a pair of amino acids that have their <it>&#945;</it>-carbon atoms within 7 &#197; of each other. Each starting point corresponds to the initial state in which each SSE has already completed its native fold independently and there are no interactions between SSEs.</p>
               </text>
               <graphic file="1471-2105-9-320-5"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Pig muscle PGK: a large protein</p>
            </st>
            <p>Phosphoglycerate kinase (PGK) from various organisms has been used as a model system for studying domain-domain interactions of multiple-domain proteins <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>. The pig muscle PGK (PDB: 1KF0) <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> is a large two-domain protein with 416 residues, with the N-terminal domain consisting of residues 1 to 155 and the C-terminal domain consisting of residues 156 to 416. There are 21 <it>&#945;</it>-helices and 17 <it>&#946;</it>-strands, which belong to four different <it>&#946;</it>-sheets A, B, C and D, arranged as follows on the primary sequence: <it>&#945;</it><sub>1 </sub><it>&#946;</it><sub>A4 </sub><it>&#945;</it><sub>2 </sub><it>&#945;</it><sub>3 </sub><it>&#946;</it><sub>A3 </sub><it>&#945;</it><sub>4 </sub><it>&#946;</it><sub>A1 </sub><it>&#945;</it><sub>5 </sub><it>&#946;</it><sub>A2 </sub><it>&#945;</it><sub>6 </sub><it>&#946;</it><sub>B1 </sub><it>&#946;</it><sub>B2 </sub><it>&#945;</it><sub>7 </sub><it>&#946;</it><sub>A5 </sub><it>&#945;</it><sub>8 </sub><it>&#945;</it><sub>9 </sub><it>&#946;</it><sub>A6 </sub><it>&#945;</it><sub>10 </sub><it>&#946;</it><sub>C3 </sub><it>&#945;</it><sub>11 </sub><it>&#945;</it><sub>12 </sub><it>&#946;</it><sub>C2 </sub><it>&#945;</it><sub>13 </sub><it>&#945;</it><sub>14 </sub><it>&#945;</it><sub>15 </sub><it>&#946;</it><sub>C1 </sub><it>&#946;</it><sub>D2 </sub><it>&#946;</it><sub>D1 </sub><it>&#946;</it><sub>D3 </sub><it>&#945;</it><sub>16 </sub><it>&#946;</it><sub>C4 </sub><it>&#945;</it><sub>17 </sub><it>&#945;</it><sub>18 </sub><it>&#946;</it><sub>C5 </sub><it>&#945;</it><sub>19 </sub><it>&#946;</it><sub>C6 </sub><it>&#945;</it><sub>20 </sub><it>&#945;</it><sub>21</sub>.</p>
            <p>Figure <figr fid="F6">6</figr> shows our folding pathway prediction for the pig muscle PGK, in which <it>&#946;</it>-sheet D is formed first, followed by the formation of <it>&#946;</it>-sheet C interspersed with <it>&#945;</it>-helices in the C-terminal domain. After most SSEs of the C-terminal domain are formed, the SSEs of the N-terminal domain begin to form, with <it>&#946;</it>-sheet A formed before <it>&#946;</it>-sheet B interspersed with <it>&#945;</it>-helices in the N-terminal domain.</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Folding pathway prediction for the pig muscle PGK</p>
               </caption>
               <text>
                  <p>
                     <b>Folding pathway prediction for the pig muscle PGK.</b>
                  </p>
               </text>
               <graphic file="1471-2105-9-320-6"/>
            </fig>
            <p>Szil&#225;gyi and Vas <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> suggested a sequential domain refolding mechanism for the pig muscle PGK, in which folding of the C-terminal domain is independent of the N-terminal domain and takes place first, and folding of the N-terminal domain starts after most of the C-terminal domain folds. The authors also suggested that an intermediate consists of a folded C-terminal domain and a still unfolded N-terminal domain. Our prediction is consistent with these experimental results.</p>
         </sec>
         <sec>
            <st>
               <p>Other proteins</p>
            </st>
            <p>Figure <figr fid="F7">7</figr> shows folding pathway predictions for various small proteins that have known intermediate folding states from biological experiments. The proteins 1BDD and 2CRT were studied in <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, while the proteins 1BIN, 1MBC, 2CI2 and 6PTI were studied in <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Folding pathway predictions for <it>Staphylococcus aureus</it> protein A domain B (PDB: 1BDD), leghemoglobin A (PDB: 1BIN), myoglobin (PDB: 1MBC), chymotrypsin inhibitor 2 structure 1 (PDB: 1COA), chymotrypsin inhibitor 2 structure 2 (PDB: 2CI2), cardiotoxin III (PDB: 2CRT), and bovine pancreatic trypsin inhibitor BPTI (PDB: 6PTI)</p>
               </caption>
               <text>
                  <p>
                     <b>Folding pathway predictions for <it>Staphylococcus aureus</it> protein A domain B (PDB: 1BDD), leghemoglobin A (PDB: 1BIN), myoglobin (PDB: 1MBC), chymotrypsin inhibitor 2 structure 1 (PDB: 1COA), chymotrypsin inhibitor 2 structure 2 (PDB: 2CI2), cardiotoxin III (PDB: 2CRT), and bovine pancreatic trypsin inhibitor BPTI (PDB: 6PTI).</b>
                  </p>
               </text>
               <graphic file="1471-2105-9-320-7"/>
            </fig>
            <p>The B domain of <it>Staphylococcus aureus </it>protein A (PDB: 1BDD) consists of three <it>&#945;</it>-helices. In our prediction, <it>&#945;</it><sub>2 </sub>and <it>&#945;</it><sub>3 </sub>interact first, then <it>&#945;</it><sub>1 </sub>is added. This is consistent with the result of the out-exchange experiment in <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> and experimental results under high temperature <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>.</p>
            <p>Although two members of the globin protein family, leghemoglobin A (PDB: 1BIN) and myoglobin (PDB: 1MBC), have very low sequence similarity, they both consist of eight <it>&#945;</it>-helices and have very similar tertiary structures. Nishimura et al. <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> compared their folding pathways experimentally. For leghemoglobin A, <it>&#945;</it><sub>G</sub>, <it>&#945;</it><sub>H</sub>, and part of <it>&#945;</it><sub>E </sub>form stable structures first, while <it>&#945;</it><sub>A </sub>and <it>&#945;</it><sub>B </sub>form in the later stages of the folding pathway. For myoglobin, <it>&#945;</it><sub>A</sub>, <it>&#945;</it><sub>G </sub>and <it>&#945;</it><sub>H </sub>form stable contacts first. The main difference between the two folding pathways is that <it>&#945;</it><sub>A </sub>and <it>&#945;</it><sub>B </sub>form earlier in the folding pathway of myoglobin than in the folding pathway of leghemoglobin A <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. Our predictions distinguish between these two pathways. For leghemoglobin A, <it>&#945;</it><sub>G </sub>and <it>&#945;</it><sub>H </sub>are predicted to interact first, then <it>&#945;</it><sub>E </sub>is added, with <it>&#945;</it><sub>B </sub>and <it>&#945;</it><sub>A </sub>added later. For myoglobin, <it>&#945;</it><sub>G </sub>and <it>&#945;</it><sub>H </sub>are also predicted to interact first, then <it>&#945;</it><sub>A </sub>is added, followed by <it>&#945;</it><sub>E </sub>and <it>&#945;</it><sub>B</sub>.</p>
            <p>There are two crystal structures for chymotrypsin inhibitor 2 (PDB: 1COA and 2CI2). While 2CI2 consists of 83 residues, 1COA is a fragment of 2CI2 from residues 20 to 83. They both consist of one <it>&#945;</it>-helix and four <it>&#946;</it>-strands, which are arranged as <it>&#946;</it><sub>1</sub><it>&#945;</it><sub>1</sub><it>&#946;</it><sub>2</sub><it>&#946;</it><sub>3</sub><it>&#946;</it><sub>4 </sub>in 1COA and <it>&#946;</it><sub>1</sub><it>&#945;</it><sub>1</sub><it>&#946;</it><sub>4</sub><it>&#946;</it><sub>3</sub><it>&#946;</it><sub>2 </sub>in 2CI2. In our predictions, 1COA and 2CI2 have the same folding pathway, with the middle two <it>&#946;</it>-strands interacting first, then the <it>&#945;</it>-helix is added, followed by the C-terminal <it>&#946;</it>-strand, and the N-terminal <it>&#946;</it>-strand is added last. For 1COA, simulation by <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> demonstrated that <it>&#946;</it><sub>2 </sub>and <it>&#946;</it><sub>3 </sub>form contacts first, then <it>&#945;</it><sub>1 </sub>is added to form a folding nucleus. The coalescence of <it>&#946;</it><sub>1 </sub>is the rate-limiting step and is completed at the end of the folding process. This is consistent with the result of the out-exchange experiment in <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> that showed that <it>&#946;</it><sub>2</sub>, <it>&#946;</it><sub>3 </sub>and <it>&#945;</it><sub>1 </sub>form contacts first. Our prediction is consistent with these results.</p>
            <p>The all <it>&#946;</it>-sheet protein cardiotoxin III (PDB: 2CRT) consists of five strands. While <it>&#946;</it><sub>1 </sub>and <it>&#946;</it><sub>2 </sub>form a double-stranded domain, <it>&#946;</it><sub>3</sub>, <it>&#946;</it><sub>4 </sub>and <it>&#946;</it><sub>5 </sub>form a triple-stranded domain. Using an amide proton pulse exchange experiment, Sivaraman et al. <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> showed that the triple-stranded domain forms earlier than the double-stranded domain during the refolding process. The carbonyl groups in <it>&#946;</it><sub>3 </sub>and the amide groups in <it>&#946;</it><sub>5 </sub>are hydrogen bonding partners, which are important for the formation of a hydrophobic cluster <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. Our prediction is consistent with these results: <it>&#946;</it><sub>3 </sub>and <it>&#946;</it><sub>5 </sub>interact first, then <it>&#946;</it><sub>4 </sub>is added to form the triple-stranded domain, followed by <it>&#946;</it><sub>2 </sub>and <it>&#946;</it><sub>1 </sub>in the double-stranded domain.</p>
            <p>Bovine pancreatic trypsin inhibitor BPTI (PDB: 6PTI) is a globular protein with two <it>&#945;</it>-helices and three <it>&#946;</it>-strands, which are arranged as <it>&#945;</it><sub>1</sub><it>&#946;</it><sub>2</sub><it>&#946;</it><sub>1</sub><it>&#946;</it><sub>3</sub><it>&#945;</it><sub>2</sub>. Three disulfide bonds between residues 5 and 55, 14 and 38, and 30 and 51 play an important role in stabilizing the native structure <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, and their formation order was studied in <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. In our prediction, <it>&#946;</it><sub>1 </sub>and <it>&#946;</it><sub>2 </sub>interact first, then <it>&#945;</it><sub>2 </sub>is added. This brings residues 30 and 51 close together and helps to form the disulfide bond between them. Then <it>&#945;</it><sub>1 </sub>is added, which helps to form the disulfide bonds between residues 5 and 55 and between residues 14 and 38. Our prediction that <it>&#946;</it><sub>1 </sub>and <it>&#946;</it><sub>2 </sub>interact earlier than the two <it>&#945;</it>-helices is consistent with the result in <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>While our strategy corresponds most closely to the diffusion-collision model, which allows folding to proceed independently in different parts of a protein <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, a modified strategy can be used for other models. For example, to simulate the nucleation-propagation model <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> or the nucleation-condensation model <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>, in which the existence of a nucleus facilitates further folding, one can iteratively add to the nucleus the SSE that results in the lowest free energy. Since energy computations can take hours and account for a significant amount of the computation time in our algorithm, it is also possible to use lower resolution methods to compute energy.</p>
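         <p>The nucleus-growing variant described above can be sketched as a greedy loop. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: the function name <it>grow_nucleus</it>, the SSE labels, and the toy pairwise energies are all hypothetical, and a real energy function would replace <it>toy_energy</it>.</p>

```python
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical pairwise contact energies between SSE labels (toy values only).
TOY_CONTACTS = {
    frozenset(("a", "b")): -3.0,
    frozenset(("b", "c")): -2.0,
    frozenset(("a", "c")): -0.5,
    frozenset(("c", "d")): -1.0,
}


def toy_energy(folded: FrozenSet[str]) -> float:
    """Toy stand-in for a free energy: sum of contacts fully inside `folded`."""
    return sum(e for pair, e in TOY_CONTACTS.items() if pair.issubset(folded))


def grow_nucleus(
    sses: Iterable[str],
    nucleus: Iterable[str],
    energy: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Greedily add, one at a time, the SSE that gives the lowest energy."""
    folded = set(nucleus)
    remaining = set(sses) - folded
    order: List[str] = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = min(sorted(remaining), key=lambda s: energy(frozenset(folded | {s})))
        folded.add(best)
        remaining.remove(best)
        order.append(best)
    return order


if __name__ == "__main__":
    print(grow_nucleus(["a", "b", "c", "d"], ["a"], toy_energy))
```

         <p>Each iteration evaluates the energy once per remaining SSE, so the number of energy computations grows quadratically with the number of SSEs in this sketch.</p>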
         <p>While our strategy finds the most energetically favorable protein folding pathway, there is evidence that multiple folding pathways exist <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B57">57</abbr></abbrgrp>. The ability to analyze multiple folding pathways will also allow the study of protein misfolding <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>. Our approach can be generalized to study the entire free energy landscape <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> as follows: construct a graph in which each vertex represents a biologically plausible conformation and each edge represents a feasible conformation change. This graph is similar to the roadmap graph in <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> and <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> and the protein folding network in <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>, except that we consider each SSE as an indivisible unit. Various graph-theoretic algorithms can then be used to generate predictions of alternative folding pathways.</p>
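         <p>One possible reading of this graph construction is sketched below, assuming that a conformation can be summarized by the set of SSEs already formed and that a feasible change adds exactly one SSE. Both assumptions, the names <it>build_graph</it> and <it>ranked_pathways</it>, and the toy step cost are hypothetical and serve only to illustrate how alternative pathways could be enumerated and ranked.</p>

```python
from itertools import combinations

# Hypothetical native contacts between SSE labels (toy data only).
TOY_CONTACTS = {frozenset(("a", "b")), frozenset(("b", "c"))}


def toy_step_cost(formed, added):
    """Toy cost: cheap if `added` touches a formed SSE (or nothing is formed yet)."""
    if not formed or any(frozenset((added, t)) in TOY_CONTACTS for t in formed):
        return 1.0
    return 3.0


def build_graph(sses):
    """Vertices are frozensets of formed SSEs; each edge adds exactly one SSE."""
    vertices = [
        frozenset(c) for r in range(len(sses) + 1) for c in combinations(sses, r)
    ]
    return {v: [v | {s} for s in sses if s not in v] for v in vertices}


def ranked_pathways(sses, step_cost):
    """Enumerate every pathway from unfolded to native, sorted by total cost."""
    graph = build_graph(sses)
    goal = frozenset(sses)
    paths = []

    def walk(vertex, path, cost):
        if vertex == goal:
            paths.append((cost, path))
            return
        for nxt in graph[vertex]:
            (added,) = nxt - vertex  # the single SSE added along this edge
            walk(nxt, path + [added], cost + step_cost(vertex, added))

    walk(frozenset(), [], 0.0)
    return sorted(paths)


if __name__ == "__main__":
    for cost, path in ranked_pathways(["a", "b", "c"], toy_step_cost):
        print(cost, path)
```

         <p>Exhaustive enumeration is exponential in the number of SSEs, so this sketch is only practical for small proteins; for larger ones a shortest-path or k-shortest-path algorithm over the same graph would be a more natural choice.</p>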
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We have shown that our procedure is accurate enough to distinguish between subtle differences in the folding pathways of structurally similar proteins, and that it is fast enough to be applied to large proteins. An important future direction is to consider cooperative folding of secondary structures, that is, cases in which folding in one secondary structure affects folding in others, without sacrificing too much speed.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>QY performed the research and implemented the algorithm. S-HS supervised the research. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported by NSF grant DBI-0624077. We thank Yutu Liu for many helpful discussions and for drawing our attention to the problem.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Are there pathways for protein folding?</p>
            </title>
            <aug>
               <au>
                  <snm>Levinthal</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Chim Phys</source>
            <pubdate>1968</pubdate>
            <volume>65</volume>
            <fpage>44</fpage>
            <lpage>45</lpage>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Experimental and theoretical aspects of protein folding</p>
            </title>
            <aug>
               <au>
                  <snm>Anfinsen</snm>
                  <fnm>C B</fnm>
               </au>
               <au>
                  <snm>Scheraga</snm>
                  <fnm>H A</fnm>
               </au>
            </aug>
            <source>Adv Protein Chem</source>
            <pubdate>1975</pubdate>
            <volume>29</volume>
            <fpage>205</fpage>
            <lpage>300</lpage>
            <xrefbib>
               <pubid idtype="pmpid">237413</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Intermediates in the folding reactions of small proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>P S</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>R L</fnm>
               </au>
            </aug>
            <source>Ann Rev Biochem</source>
            <pubdate>1990</pubdate>
            <volume>59</volume>
            <fpage>631</fpage>
            <lpage>660</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2197986</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Pathways of protein folding</p>
            </title>
            <aug>
               <au>
                  <snm>Matthews</snm>
                  <fnm>C R</fnm>
               </au>
            </aug>
            <source>Ann Rev Biochem</source>
            <pubdate>1993</pubdate>
            <volume>62</volume>
            <fpage>653</fpage>
            <lpage>683</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8352599</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>From Levinthal to pathways to funnels</p>
            </title>
            <aug>
               <au>
                  <snm>Dill</snm>
                  <fnm>K A</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>H S</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>1997</pubdate>
            <volume>4</volume>
            <fpage>10</fpage>
            <lpage>19</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8989315</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Protein folding: the free energy surface</p>
            </title>
            <aug>
               <au>
                  <snm>Gruebele</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>161</fpage>
            <lpage>168</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11959492</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Submillisecond kinetics of protein folding</p>
            </title>
            <aug>
               <au>
                  <snm>Eaton</snm>
                  <fnm>W A</fnm>
               </au>
               <au>
                  <snm>Mu&#241;oz</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>P A</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>C K</fnm>
               </au>
               <au>
                  <snm>Hofrichter</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1997</pubdate>
            <volume>7</volume>
            <fpage>10</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">9032067</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>The folding pathway of a protein at high resolution from microseconds to seconds</p>
            </title>
            <aug>
               <au>
                  <snm>N&#246;lting</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Golbik</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Neira</snm>
                  <fnm>J L</fnm>
               </au>
               <au>
                  <snm>Soler Gonzalez</snm>
                  <fnm>A S</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Fersht</snm>
                  <fnm>A R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1997</pubdate>
            <volume>94</volume>
            <fpage>826</fpage>
            <lpage>830</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">19598</pubid>
                  <pubid idtype="pmpid" link="fulltext">9023341</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Protein folding by restrained energy minimization and molecular dynamics</p>
            </title>
            <aug>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1983</pubdate>
            <volume>170</volume>
            <fpage>723</fpage>
            <lpage>764</lpage>
            <xrefbib>
               <pubid idtype="pmpid">6195346</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Realistic simulations of native-protein dynamics in solution and beyond</p>
            </title>
            <aug>
               <au>
                  <snm>Daggett</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Ann Rev Biophys Biomol Struct</source>
            <pubdate>1993</pubdate>
            <volume>22</volume>
            <fpage>353</fpage>
            <lpage>380</lpage>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution</p>
            </title>
            <aug>
               <au>
                  <snm>Duan</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kollman</snm>
                  <fnm>P A</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1998</pubdate>
            <volume>282</volume>
            <fpage>740</fpage>
            <lpage>744</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9784131</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Molecular dynamics simulations of the protein unfolding/folding reaction</p>
            </title>
            <aug>
               <au>
                  <snm>Daggett</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Acc Chem Res</source>
            <pubdate>2002</pubdate>
            <volume>35</volume>
            <fpage>422</fpage>
            <lpage>429</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12069627</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Monte Carlo simulations of protein folding. I. Lattice model and interaction scheme</p>
            </title>
            <aug>
               <au>
                  <snm>Kolinski</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Skolnick</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1994</pubdate>
            <volume>18</volume>
            <fpage>338</fpage>
            <lpage>352</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8208726</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Folding proteins with a simple energy function and extensive conformational searching</p>
            </title>
            <aug>
               <au>
                  <snm>Yue</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dill</snm>
                  <fnm>K A</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1996</pubdate>
            <volume>5</volume>
            <fpage>254</fpage>
            <lpage>261</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2143350</pubid>
                  <pubid idtype="pmpid" link="fulltext">8745403</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures</p>
            </title>
            <aug>
               <au>
                  <snm>Alm</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>11305</fpage>
            <lpage>11310</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">18029</pubid>
                  <pubid idtype="pmpid" link="fulltext">10500172</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>A simple model for calculating the kinetics of protein folding from three-dimensional structures</p>
            </title>
            <aug>
               <au>
                  <snm>Mu&#241;oz</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Eaton</snm>
                  <fnm>W A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>11311</fpage>
            <lpage>11316</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">18030</pubid>
                  <pubid idtype="pmpid" link="fulltext">10500173</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Using motion planning to study protein folding pathways</p>
            </title>
            <aug>
               <au>
                  <snm>Amato</snm>
                  <fnm>N M</fnm>
               </au>
               <au>
                  <snm>Song</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Comput Biol</source>
            <pubdate>2002</pubdate>
            <volume>9</volume>
            <fpage>149</fpage>
            <lpage>168</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12015875</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p><it>Ab initio</it> simulations of protein-folding pathways by molecular dynamics with the united-residue model of polypeptide chains</p>
            </title>
            <aug>
               <au>
                  <snm>Liwo</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Khalili</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Scheraga</snm>
                  <fnm>H A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <fpage>2362</fpage>
            <lpage>2367</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">548970</pubid>
                  <pubid idtype="pmpid" link="fulltext">15677316</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Characterization of protein-folding pathways by reduced-space modeling</p>
            </title>
            <aug>
               <au>
                  <snm>Kmiecik</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kolinski</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2007</pubdate>
            <volume>104</volume>
            <fpage>12330</fpage>
            <lpage>12335</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1941469</pubid>
                  <pubid idtype="pmpid" link="fulltext">17636132</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Folding pathway of the B1 domain of protein G explored by multiscale modeling</p>
            </title>
            <aug>
               <au>
                  <snm>Kmiecik</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kolinski</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biophys J</source>
            <pubdate>2008</pubdate>
            <volume>94</volume>
            <fpage>726</fpage>
            <lpage>736</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17890394</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>The Protein Data Bank</p>
            </title>
            <aug>
               <au>
                  <snm>Berman</snm>
                  <fnm>H M</fnm>
               </au>
               <au>
                  <snm>Westbrook</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Feng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Gilliland</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bhat</snm>
                  <fnm>T N</fnm>
               </au>
               <au>
                  <snm>Weissig</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Shindyalov</snm>
                  <fnm>I N</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>P E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>235</fpage>
            <lpage>242</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102472</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592235</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Mechanism of protein folding</p>
            </title>
            <aug>
               <au>
                  <snm>N&#246;lting</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Andert</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2000</pubdate>
            <volume>41</volume>
            <fpage>288</fpage>
            <lpage>298</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11025541</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Protein tertiary structure prediction using a branch and bound algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Eyrich</snm>
                  <fnm>V A</fnm>
               </au>
               <au>
                  <snm>Standley</snm>
                  <fnm>D M</fnm>
               </au>
               <au>
                  <snm>Felts</snm>
                  <fnm>A K</fnm>
               </au>
               <au>
                  <snm>Friesner</snm>
                  <fnm>R A</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1999</pubdate>
            <volume>35</volume>
            <fpage>41</fpage>
            <lpage>57</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10090285</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Capturing molecular energy landscapes with probabilistic conformational roadmaps</p>
            </title>
            <aug>
               <au>
                  <snm>Apaydin</snm>
                  <fnm>M S</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>A P</fnm>
               </au>
               <au>
                  <snm>Brutlag</snm>
                  <fnm>D L</fnm>
               </au>
               <au>
                  <snm>Latombe</snm>
                  <fnm>J C</fnm>
               </au>
            </aug>
            <source>Proceedings of the IEEE International Conference on Robotics and Automation</source>
            <pubdate>2001</pubdate>
            <fpage>932</fpage>
            <lpage>939</lpage>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Predicting protein folding pathways</p>
            </title>
            <aug>
               <au>
                  <snm>Zaki</snm>
                  <fnm>M J</fnm>
               </au>
               <au>
                  <snm>Nadimpally</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Bardhan</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bystroff</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20 Suppl 1</volume>
            <fpage>386</fpage>
            <lpage>393</lpage>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Dictionary of protein secondary structure: pattern recognition of
  hydrogen-bonded and geometrical features</p>
            </title>
            <aug>
               <au>
                  <snm>Kabsch</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Biopolymers</source>
            <pubdate>1983</pubdate>
            <volume>22</volume>
            <fpage>2577</fpage>
            <lpage>2637</lpage>
            <xrefbib>
               <pubid idtype="pmpid">6667333</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Protein structure prediction using Rosetta</p>
            </title>
            <aug>
               <au>
                  <snm>Rohl</snm>
                  <fnm>C A</fnm>
               </au>
               <au>
                  <snm>Strauss</snm>
                  <fnm>C E M</fnm>
               </au>
               <au>
                  <snm>Misura</snm>
                  <fnm>K M S</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Methods Enzymol</source>
            <pubdate>2004</pubdate>
            <volume>383</volume>
            <fpage>66</fpage>
            <lpage>93</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15063647</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Improved recognition of native-like protein structures using a
  combination of sequence-dependent and sequence-independent features of
  proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Simons</snm>
                  <fnm>K T</fnm>
               </au>
               <au>
                  <snm>Ruczinski</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kooperberg</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fox</snm>
                  <fnm>B A</fnm>
               </au>
               <au>
                  <snm>Bystroff</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1999</pubdate>
            <volume>34</volume>
            <fpage>82</fpage>
            <lpage>95</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10336385</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>A path planning-based study of protein folding with a case study of
  hairpin formation in protein G and L</p>
            </title>
            <aug>
               <au>
                  <snm>Song</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dill</snm>
                  <fnm>K A</fnm>
               </au>
               <au>
                  <snm>Scholtz</snm>
                  <fnm>J M</fnm>
               </au>
               <au>
                  <snm>Amato</snm>
                  <fnm>N M</fnm>
               </au>
            </aug>
            <source>Pacific Symposium on Biocomputing</source>
            <pubdate>2003</pubdate>
            <fpage>240</fpage>
            <lpage>251</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12603032</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Kinetic analysis of folding and unfolding the 56 amino acid
  IgG-binding domain of streptococcal protein G</p>
            </title>
            <aug>
               <au>
                  <snm>Alexander</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Orban</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bryan</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1992</pubdate>
            <volume>31</volume>
            <fpage>7243</fpage>
            <lpage>7248</lpage>
            <xrefbib>
               <pubid idtype="pmpid">1510916</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>A short linear peptide that folds into a native stable
&#946;-hairpin in aqueous solution</p>
            </title>
            <aug>
               <au>
                  <snm>Blanco</snm>
                  <fnm>F J</fnm>
               </au>
               <au>
                  <snm>Rivas</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Serrano</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>1994</pubdate>
            <volume>1</volume>
            <fpage>584</fpage>
            <lpage>590</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">7634098</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Two crystal structures of the B1 immunoglobulin-binding domain of
  streptococcal protein G and comparison with NMR</p>
            </title>
            <aug>
               <au>
                  <snm>Gallagher</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Alexander</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bryan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Gilliland</snm>
                  <fnm>G L</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1994</pubdate>
            <volume>33</volume>
            <fpage>4721</fpage>
            <lpage>4729</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8161530</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Folding of protein G B1 domain studied by the conformational
characterization of fragments comprising its secondary structure
elements</p>
            </title>
            <aug>
               <au>
                  <snm>Blanco</snm>
                  <fnm>F J</fnm>
               </au>
               <au>
                  <snm>Serrano</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>1995</pubdate>
            <volume>230</volume>
            <fpage>634</fpage>
            <lpage>649</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">7607238</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>A breakdown of symmetry in the folding transition state of protein
  L</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>D E</fnm>
               </au>
               <au>
                  <snm>Fisher</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>298</volume>
            <fpage>971</fpage>
            <lpage>984</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10801362</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Critical role of &#946;-hairpin formation in protein G folding</p>
            </title>
            <aug>
               <au>
                  <snm>McCallister</snm>
                  <fnm>E L</fnm>
               </au>
               <au>
                  <snm>Alm</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <fpage>669</fpage>
            <lpage>673</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">10932252</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Computer-based redesign of a protein folding pathway</p>
            </title>
            <aug>
               <au>
                  <snm>Nauli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kuhlman</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <fpage>602</fpage>
            <lpage>605</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11427890</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>An experimental investigation of conformational fluctuations in
  proteins G and L</p>
            </title>
            <aug>
               <au>
                  <snm>Tunnicliffe</snm>
                  <fnm>R B</fnm>
               </au>
               <au>
                  <snm>Waby</snm>
                  <fnm>J L</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>R J</fnm>
               </au>
               <au>
                  <snm>Williamson</snm>
                  <fnm>M P</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>2005</pubdate>
            <volume>13</volume>
            <fpage>1677</fpage>
            <lpage>1684</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16271891</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>The role of sequence and structure in protein folding kinetics: the
  diffusion-collision model applied to proteins L and G</p>
            </title>
            <aug>
               <au>
                  <snm>Islam</snm>
                  <fnm>S A</fnm>
               </au>
               <au>
                  <snm>Karplus</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weaver</snm>
                  <fnm>D L</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>2004</pubdate>
            <volume>12</volume>
            <fpage>1833</fpage>
            <lpage>1845</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15458632</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>NMR characterization of residual structure in the denatured state
  of protein L</p>
            </title>
            <aug>
               <au>
                  <snm>Yi</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Scalley-Kim</snm>
                  <fnm>M L</fnm>
               </au>
               <au>
                  <snm>Alm</snm>
                  <fnm>E J</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>299</volume>
            <fpage>1341</fpage>
            <lpage>1351</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10873457</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Simulating protein motions with rigidity analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Thomas</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Tapia</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Amato</snm>
                  <fnm>N M</fnm>
               </au>
            </aug>
            <source>J Comput Biol</source>
            <pubdate>2007</pubdate>
            <volume>14</volume>
            <fpage>839</fpage>
            <lpage>855</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17691897</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Crystal structures and increased stabilization of the protein G
  variants with switched folding pathways NuG1 and NuG2</p>
            </title>
            <aug>
               <au>
                  <snm>Nauli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kuhlman</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Le Trong</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Stenkamp</snm>
                  <fnm>R E</fnm>
               </au>
               <au>
                  <snm>Teller</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2002</pubdate>
            <volume>11</volume>
            <fpage>2924</fpage>
            <lpage>2931</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2373753</pubid>
                  <pubid idtype="pmpid" link="fulltext">12441390</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Domain behavior during the folding of a thermostable
  phosphoglycerate kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Parker</snm>
                  <fnm>M J</fnm>
               </au>
               <au>
                  <snm>Spencer</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>G S</fnm>
               </au>
               <au>
                  <snm>Burston</snm>
                  <fnm>S G</fnm>
               </au>
               <au>
                  <snm>Hosszu</snm>
                  <fnm>L L</fnm>
               </au>
               <au>
                  <snm>Craven</snm>
                  <fnm>C J</fnm>
               </au>
               <au>
                  <snm>Waltho</snm>
                  <fnm>J P</fnm>
               </au>
               <au>
                  <snm>Clarke</snm>
                  <fnm>A R</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1996</pubdate>
            <volume>35</volume>
            <fpage>15740</fpage>
            <lpage>15752</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8961937</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Crystallographic and thiol-reactivity studies on the complex of pig
muscle phosphoglycerate kinase with ATP analogues: correlation between
nucleotide binding mode and helix flexibility</p>
            </title>
            <aug>
               <au>
                  <snm>Kov&#225;ri</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Flachner</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>N&#225;ray-Szab&#243;</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Vas</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>2002</pubdate>
            <volume>41</volume>
            <fpage>8796</fpage>
            <lpage>8806</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12102622</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Asymmetric effect of domain interactions on the kinetics of folding
  in yeast phosphoglycerate kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Osv&#225;th</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>K&#246;hler</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Z&#225;vodszky</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Fidy</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2005</pubdate>
            <volume>14</volume>
            <fpage>1609</fpage>
            <lpage>1616</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2253372</pubid>
                  <pubid idtype="pmpid" link="fulltext">15883189</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Sequential domain refolding of pig muscle 3-phosphoglycerate
  kinase: kinetic analysis of reactivation</p>
            </title>
            <aug>
               <au>
                  <snm>Szil&#225;gyi</snm>
                  <fnm>A N</fnm>
               </au>
               <au>
                  <snm>Vas</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Fold Des</source>
            <pubdate>1998</pubdate>
            <volume>3</volume>
            <fpage>565</fpage>
            <lpage>575</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">9889168</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>The hydrogen exchange core and protein folding</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Woodward</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1999</pubdate>
            <volume>8</volume>
            <fpage>1571</fpage>
            <lpage>1590</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2144413</pubid>
                  <pubid idtype="pmpid" link="fulltext">10452602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Flexibly varying folding mechanism of a nearly symmetrical protein:
  B domain of protein A</p>
            </title>
            <aug>
               <au>
                  <snm>Itoh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sasai</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <fpage>7298</fpage>
            <lpage>7303</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1564280</pubid>
                  <pubid idtype="pmpid" link="fulltext">16648265</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Conservation of folding pathways in evolutionarily distant globin
  sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Nishimura</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Prytulla</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dyson</snm>
                  <fnm>H J</fnm>
               </au>
               <au>
                  <snm>Wright</snm>
                  <fnm>P E</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <fpage>679</fpage>
            <lpage>686</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10932254</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>"New view" of protein folding reconciled with the old through
  multiple unfolding simulations</p>
            </title>
            <aug>
               <au>
                  <snm>Lazaridis</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Karplus</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1997</pubdate>
            <volume>278</volume>
            <fpage>1928</fpage>
            <lpage>1931</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">9395391</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Events in the kinetic folding pathway of a small, all
&#946;-sheet protein</p>
            </title>
            <aug>
               <au>
                  <snm>Sivaraman</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>T K</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>D K</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>W Y</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1998</pubdate>
            <volume>273</volume>
            <fpage>10181</fpage>
            <lpage>10189</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">9553067</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>A kinetic explanation for the rearrangement pathway of BPTI
  folding</p>
            </title>
            <aug>
               <au>
                  <snm>Weissman</snm>
                  <fnm>J S</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>P S</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>1995</pubdate>
            <volume>2</volume>
            <fpage>1123</fpage>
            <lpage>1130</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8846225</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Mutational analysis of the BPTI folding pathway: I. Effects of
aromatic &#10142; leucine substitutions on the distribution
of folding intermediates</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J X</fnm>
               </au>
               <au>
                  <snm>Goldenberg</snm>
                  <fnm>D P</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1997</pubdate>
            <volume>6</volume>
            <fpage>1549</fpage>
            <lpage>1562</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2143733</pubid>
                  <pubid idtype="pmpid" link="fulltext">9232656</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Simulations of the structural and dynamical properties of denatured proteins: the "molten coil" state of bovine pancreatic trypsin inhibitor</p>
            </title>
            <aug>
               <au>
                  <snm>Kazmirski</snm>
                  <fnm>S L</fnm>
               </au>
               <au>
                  <snm>Daggett</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>277</volume>
            <fpage>487</fpage>
            <lpage>506</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">9514766</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Protein folding dynamics: the diffusion-collision model and
  experimental data</p>
            </title>
            <aug>
               <au>
                  <snm>Karplus</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weaver</snm>
                  <fnm>D L</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>650</fpage>
            <lpage>668</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2142854</pubid>
                  <pubid idtype="pmpid" link="fulltext">8003983</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Specific nucleus as the transition state for protein folding:
  evidence from the lattice model</p>
            </title>
            <aug>
               <au>
                  <snm>Abkevich</snm>
                  <fnm>V I</fnm>
               </au>
               <au>
                  <snm>Gutin</snm>
                  <fnm>A M</fnm>
               </au>
               <au>
                  <snm>Shakhnovich</snm>
                  <fnm>E I</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1994</pubdate>
            <volume>33</volume>
            <fpage>10026</fpage>
            <lpage>10036</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8060971</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Optimization of rates of protein folding: the
  nucleation-condensation mechanism and its implications</p>
            </title>
            <aug>
               <au>
                  <snm>Fersht</snm>
                  <fnm>A R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1995</pubdate>
            <volume>92</volume>
            <fpage>10869</fpage>
            <lpage>10873</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">40532</pubid>
                  <pubid idtype="pmpid" link="fulltext">7479900</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Different folding transition states may result in the same native
  structure</p>
            </title>
            <aug>
               <au>
                  <snm>Viguera</snm>
                  <fnm>A R</fnm>
               </au>
               <au>
                  <snm>Serrano</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Wilmanns</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>1996</pubdate>
            <volume>3</volume>
            <fpage>874</fpage>
            <lpage>880</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8836105</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Protein folding and misfolding</p>
            </title>
            <aug>
               <au>
                  <snm>Dobson</snm>
                  <fnm>C M</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>426</volume>
            <fpage>884</fpage>
            <lpage>890</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14685248</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>The protein folding network</p>
            </title>
            <aug>
               <au>
                  <snm>Rao</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Caflisch</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>342</volume>
            <fpage>299</fpage>
            <lpage>306</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15313625</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
