<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-9-230</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>CPSP-tools &#8211; Exact and complete algorithms for high-throughput 3D lattice protein studies</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Mann</snm>
               <fnm>Martin</fnm>
               <insr iid="I1"/>
               <email>mmann@informatik.uni-freiburg.de</email>
            </au>
            <au id="A2">
               <snm>Will</snm>
               <fnm>Sebastian</fnm>
               <insr iid="I1"/>
               <email>will@informatik.uni-freiburg.de</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Backofen</snm>
               <fnm>Rolf</fnm>
               <insr iid="I1"/>
               <email>backofen@informatik.uni-freiburg.de</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Bioinformatics Group, University of Freiburg, Georges-K&#246;hler-Allee 106, 79110 Freiburg, Germany</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>230</fpage>
         <url>http://www.biomedcentral.com/1471-2105/9/230</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18462492</pubid>
               <pubid idtype="doi">10.1186/1471-2105-9-230</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>18</day>
               <month>12</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>07</day>
               <month>5</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>07</day>
               <month>5</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Mann et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The principles of protein folding and evolution pose problems of very high inherent complexity. Often these problems are tackled using simplified protein models, e.g. lattice proteins. The CPSP-tools package provides programs to solve exactly and completely the problems typical of studies using 3D lattice protein models. Among the tasks addressed are the prediction of (all) globally optimal and/or suboptimal structures as well as sequence design and neutral network exploration.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>In contrast to stochastic approaches, which are not capable of answering many fundamental questions, our methods are based on fast, non-heuristic techniques. The resulting tools are designed for high-throughput studies of 3D-lattice proteins utilising the Hydrophobic-Polar (HP) model. The source bundle is freely available <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The CPSP-tools package is the first set of exact and complete methods for extensive, high-throughput studies of non-restricted 3D-lattice protein models. In particular, our package deals with cubic and face centered cubic (FCC) lattices.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The organisation of bio-molecules, in particular proteins, in sequence and structure space has recently attracted increased attention. In particular, questions concerning the native structure, the kinetics, and the evolution of proteins have been widely studied. These problems are often tackled using simplified models such as the Hydrophobic-Polar (HP) model (e.g. Jacob <it>et al. </it><abbrgrp><abbr bid="B2">2</abbr></abbrgrp>). Though abstract, these models are computationally feasible and allow for deeper insights into fundamental and general principles <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>.</p>
         <p>Several recurring tasks can be identified in such studies using simplified models: predicting the native structure, classifying whether a sequence is protein-like, calculating its degeneracy and stability, and designing sequences that optimally fold to a given structure. The problems associated with these tasks are computationally very hard (NP-complete) <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Nevertheless, these tasks demand exact and complete (i.e. non-heuristic) methods. It is important to note that stochastic methods cannot be used to prove optimality and, in particular, cannot prove that a sequence has a unique lowest energy (protein-like) fold <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>Consequently, with the exception of Yue and Dill <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, all studies requiring complete and exact answers to optimal structure prediction were based on exhaustive enumeration. These studies were, hence, confined to small sequence lengths. In other approaches, structures are artificially restricted to be maximally compact (e.g. filling a 3 &#215; 3 &#215; 3 cube) <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. This allows for complete enumeration but artificially biases the energy function towards overall hydrophobicity.</p>
         <p>Furthermore, many studies are confined to extremely simplified models on the 2D-square or 3D-diamond-lattice <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B11">11</abbr></abbrgrp>. The coordination number, a measurement of lattice complexity, is four in both cases. The use of lattices with such a low complexity may lead to oversimplified models that are not able to reproduce real world properties. Park and Levitt <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> have shown that lattices with higher coordination number provide a much better fit to real protein structures. A further hint toward the simplicity of the 2D-lattice is the low computational complexity of inverse folding when compared to the 3D-cubic lattice <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>.</p>
         <p>The <it>Constraint-based Protein Structure Prediction (CPSP) </it>approach by Backofen and Will <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> provides a way to overcome the aforementioned obstacles. The method is tailored to the HP model introduced by Lau and Dill <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. This model is widely used in the literature <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. CPSP supports complex 3D lattices (currently cubic and face centered cubic) without artificial restrictions (e.g. to be maximally compact). The approach predicts all globally optimal structures together with a proof of optimality. No naive, exhaustive enumeration of all structures is performed and it is as fast as stochastic methods that cannot prove optimality. Backofen and Will <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> showed that the CPSP-approach could fold even sequences of length 200 to optimality within seconds. In contrast, exhaustive structure enumeration as e.g. done by Blackburne and Hirst <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> is restricted to short sequence lengths. For instance, on a 3D-cubic lattice it is only viable to enumerate up to about length 20. In fact, the exact number of structures is only known up to length 23 where there are already more than 5 &#215; 10<sup>15 </sup><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>.</p>
         <p>CPSP uses constraint programming that is commonly applied to hard (NP-complete) problems and, thus, avoids the complete expansion of the whole search space. Hence, constraint-programming techniques are a powerful tool to handle the high complexity that typifies problems related to protein structure. Constraint-programming techniques have successfully been applied to structure prediction with given secondary structure information <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, analysis of NMR data <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, and modeling of protein complexes <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
         <p>Currently, we are not aware of any other complete approach that ensures optimality of the predicted structures in different lattices. There is an alternative to CPSP for the 3D-cubic lattice, the constraint-based hydrophobic core construction method by Yue and Dill <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. This allows the prediction of optimal structures and proves their optimality. However, using the CPSP-approach, Backofen and Will showed that the method developed by Yue and Dill is not always complete in enumerating all optimal structures <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <sec>
            <st>
               <p>Complex Lattices</p>
            </st>
            <p>As mentioned before, complete structure enumeration is only applicable to simple lattices with low coordination numbers. In contrast, the CPSP-approach is built for the more complex 3D-cubic and 3D-face-centered-cubic (FCC) lattices with higher coordination numbers of 6 and 12, respectively. A main feature of the CPSP-tools is their applicability to the unrestricted FCC lattice. The FCC lattice lacks one of the main problems of the 3D-cubic lattice, the parity problem: in the cubic lattice, only sequence positions of different parity can form contacts <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Modeling protein structures on an FCC lattice, Park and Levitt <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> demonstrated that a good approximation of real protein structures is possible. They achieved a coordinate root mean square deviation of 1.78 &#197;, whereas a deviation of 2.84 &#197; was obtained in the 3D-cubic lattice. Recently, Bagci <it>et al. </it><abbrgrp><abbr bid="B23">23</abbr></abbrgrp> have shown that the neighborhood of amino acids in proteins closely resembles a distorted FCC lattice, and that the FCC is best suited for modeling proteins. The CPSP-approach is the first exact method that allows the prediction of provably optimal structures in the FCC lattice. An example is given in Figure <figr fid="F1">1</figr>.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Structure in FCC lattice model</p>
               </caption>
               <text>
                  <p><b>Structure in FCC lattice model</b>. One optimal structure of sequence <it>S</it><sub>1 </sub>from Table 2 with 50 HH-contacts in the 3D-face centered cubic (FCC) lattice model. The coloring shows H-monomers in green and P-monomers in grey.</p>
               </text>
               <graphic file="1471-2105-9-230-1"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <p>CPSP-tools provides a set of programs that enable typical, modern research tasks to be calculated efficiently and accurately. Here we list the programs, each with a typical example application. HPSTRUCT predicts (all) optimal and suboptimal structures, as required for investigating properties of low energy conformations as studied e.g. by Jacob and Unger <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The statistical analysis of protein-like sequences (see Blackburne and Hirst <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>) requires a degeneracy-based classification of sequences. This is possible with HPDEG. For the exploration of protein evolution, similar to Wroe and Chan <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, one needs to investigate the sequence-structure space. We provide HPDESIGN for sequence design and HPNNET for neutral network computation.</p>
         <p>All methods can be applied to HP-sequences in the cubic and the more complex face centered cubic lattice model. Before giving a detailed description of the tools, we first introduce the idea of H-cores, central to these methods.</p>
         <sec>
            <st>
               <p>H-core database</p>
            </st>
            <p>In the HP lattice models, two monomers form a <it>contact </it>if they occupy neighboring positions in the lattice. The <it>energy </it>of a structure is defined by the number of contacts between H-monomers, i.e. <it>HH-contacts</it>. Thus, an optimal (minimum energy) conformation maximizes the number of HH-contacts. An important observation is that optimal structures show an almost optimal (maximally compact) packing of the H-monomers. Such dispersions of H-monomers without any chain connectivity are called <it>H-cores</it>. The compactness of the H-cores is a basic feature that can be used for structure prediction and sequence design. Note that optimal H-cores are independent of a particular sequence and depend only on the number of H-monomers. Hence, compact and nearly compact H-cores can be precalculated and stored in a database. HPSTRUCT and HPDESIGN use this database as a starting point for their calculations (details later). Thereby, redundant computation is avoided, which significantly speeds up the CPSP-approach and related applications.</p>
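            <p>The contact-based energy defined above can be illustrated with a short Python sketch. It is not part of CPSP-tools; the function name <it>hp_energy</it> and the coordinate representation are hypothetical, and it assumes, as is usual for the HP model, that monomers adjacent along the chain are not counted as contacts.</p>
            <p><![CDATA[
```python
from itertools import combinations

# Unit steps to the 6 neighbours of a point in the 3D-cubic lattice.
CUBIC_NEIGHBORS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def hp_energy(sequence, coords, neighbors=CUBIC_NEIGHBORS):
    """Return minus the number of HH-contacts of a lattice structure.

    sequence: string over {"H", "P"}
    coords:   list of (x, y, z) tuples, one per monomer
    Assumption: chain neighbours (|i - j| = 1) are not counted as contacts.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("structure is not self-avoiding")
    steps = set(neighbors)
    contacts = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2 or sequence[i] != "H" or sequence[j] != "H":
            continue
        diff = tuple(a - b for a, b in zip(coords[i], coords[j]))
        if diff in steps:  # the two H-monomers occupy neighbouring positions
            contacts += 1
    return -contacts
```
]]></p>
            <p>For the sequence HPPH placed on the four corners of a unit square, the two terminal H-monomers are lattice neighbours, so the sketch reports one HH-contact, i.e. an energy of -1.</p>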
            <p>The enumeration of all optimal H-cores in complex lattice models such as FCC is a computationally hard problem by itself and was solved by Backofen and Will using constraint-programming techniques <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. First, an upper bound on the number of possible contacts for a given number of monomers is calculated via dynamic programming. Subsequently, this information is used to enumerate all compact optimal and almost optimal (<it>suboptimal</it>) H-cores for a given number of H-monomers using constraint programming. Some statistics on the number of H-cores in the 3D-cubic lattice are given in Fig. <figr fid="F4">4</figr>. It shows that the number of H-cores grows exponentially with H-core size, but still much more slowly than the number of structures for a corresponding sequence length.</p>
         </sec>
         <sec>
            <st>
               <p>HPstruct</p>
            </st>
            <sec>
               <st>
                  <p>Motivation</p>
               </st>
               <p>HPSTRUCT implements the CPSP approach, as introduced by Backofen and Will <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, to predict provably optimal structures of 3D lattice proteins in the HP-model. For a given HP-sequence <it>S </it>and a given lattice type (cubic or face centered cubic), (all) optimal structures are calculated. The CPSP approach computes the global minimal energy for <it>S</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>The CPSP-approach is based on the H-core database as described before. For a concrete sequence <it>S </it>the approach systematically examines the list of H-cores compatible with <it>S </it>in order of decreasing maximal contact number. For each core, it attempts to thread the sequence through the core. Threading means finding a placement of the monomers of <it>S </it>in a self-avoiding walk such that all H-monomers are elements of the given H-core and all P-monomers are outside of the core. Since the H-cores are considered in the order of decreasing contacts, the first successful threading results in a structure with global minimal energy. Note that at this point the algorithm has <it>proven </it>that there is no structure of <it>S </it>that forms more HH-contacts.</p>
               <p>Technically, the threading of a sequence through a core is performed by a constraint program. For this purpose, we formulate the threading problem as a constraint satisfaction problem (CSP) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. It constrains the H-monomers of the sequence to the positions in the H-core. Further, it enforces that successive monomers along the sequence are neighbors in the lattice and prohibits the multiple use of a single position. The constraint-programming machinery allows for the enumeration of all valid placements according to the given constraints. In this way, all (sub)optimal structures for a given sequence can be calculated. For a more detailed description of the CSP definition and the mechanisms for solving it see <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
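               <p>The three threading constraints described above can be made concrete with a small Python sketch for the 3D-cubic lattice. It deliberately replaces the constraint-programming machinery of HPSTRUCT with plain backtracking, so it illustrates the constraints only and not the actual search strategy or its performance. The names <it>thread</it> and <it>first_optimal_structure</it> are hypothetical, and the sketch assumes that an H-core contains exactly as many positions as the sequence has H-monomers.</p>
               <p><![CDATA[
```python
CUBIC_NEIGHBORS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def thread(sequence, core, neighbors=CUBIC_NEIGHBORS):
    """Yield every self-avoiding walk of `sequence` in which all H-monomers
    lie on positions of `core` and all P-monomers lie outside of it."""
    core = frozenset(core)
    n = len(sequence)
    if "H" not in sequence or sequence.count("H") != len(core):
        return  # assumption: one core position per H-monomer
    first_h = sequence.index("H")
    # Place the first H on a core position, grow the chain to its end,
    # then grow backwards to its start.
    order = list(range(first_h, n)) + list(range(first_h - 1, -1, -1))
    placed = {}   # chain index -> lattice position
    used = set()  # occupied lattice positions (self-avoidance)

    def place(step):
        if step == n:
            yield [placed[i] for i in range(n)]
            return
        i = order[step]
        # The chain neighbour of monomer i that has already been placed.
        anchor = placed[i - 1] if i > first_h else placed[i + 1]
        for d in neighbors:
            pos = tuple(a + b for a, b in zip(anchor, d))
            # H must be inside the core, P must be outside of it.
            if pos in used or (pos in core) != (sequence[i] == "H"):
                continue
            placed[i] = pos
            used.add(pos)
            yield from place(step + 1)
            del placed[i]
            used.remove(pos)

    for start in core:
        placed[first_h] = start
        used.add(start)
        yield from place(1)
        del placed[first_h]
        used.remove(start)


def first_optimal_structure(sequence, cores):
    """Return the first successful threading, or None.
    `cores` is assumed to be sorted by decreasing number of contacts."""
    for core in cores:
        for walk in thread(sequence, core):
            return walk
    return None
```
]]></p>
               <p>Because <it>thread</it> is a generator, a caller can stop after the first structure or consume all of them, which mirrors the choice between predicting one optimal structure and enumerating all of them.</p>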
            </sec>
            <sec>
               <st>
                  <p>Advanced Features</p>
               </st>
               <p>All resulting structures of HPSTRUCT are returned in absolute move string representation. This compactly encodes the lattice position vectors between successive monomers in the structure and reduces the space consumption for huge data sets.</p>
               <p>To handle the common case of highly degenerate sequences (with many optima), HPSTRUCT offers the possibility to limit the number of predicted structures or to generate only a representative subset. Such a subset only contains structures that are separated by at least a user-defined distance <it>k</it>. The distance measure is the Hamming distance on the absolute move strings.</p>
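               <p>The following Python sketch illustrates absolute move strings and the distance-based selection described above. The letter-to-vector mapping (F, B, L, R, U, D) is an assumption made for this example and need not match the alphabet used by CPSP-tools, and the greedy selection is only one simple way to obtain a subset with pairwise distance of at least <it>k</it>; the strategy used by HPSTRUCT is not described here.</p>
               <p><![CDATA[
```python
# Assumed mapping from move letters to unit vectors in the 3D-cubic lattice.
MOVES = {
    "F": (1, 0, 0), "B": (-1, 0, 0),
    "L": (0, 1, 0), "R": (0, -1, 0),
    "U": (0, 0, 1), "D": (0, 0, -1),
}


def moves_to_coords(moves, start=(0, 0, 0)):
    """Convert an absolute move string into a list of lattice positions."""
    coords = [start]
    for letter in moves:
        step = MOVES[letter]
        coords.append(tuple(a + b for a, b in zip(coords[-1], step)))
    return coords


def hamming(a, b):
    """Number of positions at which two equally long move strings differ."""
    if len(a) != len(b):
        raise ValueError("move strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def representative_subset(move_strings, k):
    """Greedily keep each structure whose Hamming distance to every
    structure kept so far is at least k."""
    kept = []
    for s in move_strings:
        if all(hamming(s, t) >= k for t in kept):
            kept.append(s)
    return kept
```
]]></p>
               <p>A structure of <it>n </it>monomers is encoded by <it>n </it>- 1 letters, which is what makes the representation compact for large result sets.</p>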
            </sec>
         </sec>
         <sec>
            <st>
               <p>HPdeg</p>
            </st>
            <sec>
               <st>
                  <p>Motivation</p>
               </st>
               <p>The degeneracy of an HP-sequence <it>S </it>is the number of optimal structures <it>S </it>can adopt. It can be calculated using HPDEG and is the basis for determining the stability of structures <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. HPDEG specializes HPSTRUCT and completely counts all optimal structures.</p>
               <p>An important application of HPDEG is the classification of sequences as protein-like or not. A sequence is protein-like if it can adopt only one optimal structure (degeneracy 1), a definition applied by Li <it>et al. </it><abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and Huard <it>et al. </it><abbrgrp><abbr bid="B4">4</abbr></abbrgrp> among others.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>HPDEG is directly based on the CPSP-approach to compute the degeneracy. Here, all solutions for all H-cores/CSPs are calculated. In addition, a significant acceleration of the process can be achieved by the search decomposition methods we introduced in <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. This is done by identifying sub-chains of the sequence that can be placed independently of each other. Their placements are counted separately and the resulting numbers are multiplied to give the number of structures of the whole chain. This decomposition strategy results in an average speedup of a factor of three or more.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>HPdesign</p>
            </st>
            <sec>
               <st>
                  <p>Motivation</p>
               </st>
               <p>HPDESIGN solves the inverse folding problem, i.e. the design of sequences that form a given structure <it>X </it>as their unique optimum. It allows deeper investigations of sequence-structure relations and a better understanding of general properties of protein folding <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>.</p>
               <p>The inverse folding problem (IFP) in 3D lattices has been shown by Berman <it>et al. </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp> to be NP-complete, i.e. like the protein folding problem, it is a computationally hard problem. In contrast, as the same authors show, the IFP in the simple 2D lattice is solvable in polynomial time. This indicates once more the higher complexity of three-dimensional lattice models. To our knowledge, HPDESIGN is the only method applicable to a 3D-model that calculates the desired sequence properties without exhaustive sequence space enumeration.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>The approach is based on the CPSP H-core database in order to get a set of good candidate sequences <it>C</it>. First, using H-cores ordered by decreasing size and optimality, the core is matched against the structure. For each match a candidate sequence is derived and added to <it>C</it>. Afterwards, the degeneracy of each <it>c </it>&#8712; <it>C </it>is evaluated and it is checked whether <it>X </it>is its optimal structure.</p>
               <p>The candidate set <it>C</it>, produced by the filtering step using the H-cores, consists of sequences that can adopt <it>X </it>with an optimal or slightly sub-optimal H-core. Therefore, their probability of forming <it>X </it>as their unique optimum is very high and the size of <it>C </it>is very small compared to the whole sequence space. The latter is of high importance for the performance of the method.</p>
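               <p>One possible reading of the filtering step is sketched below in Python. It is an illustration under simplifying assumptions, not the HPDESIGN implementation: the function name <it>candidate_sequences</it> is hypothetical, only translations of the core are tried (rotations and reflections are ignored), and the subsequent degeneracy check of each candidate is left out.</p>
               <p><![CDATA[
```python
def candidate_sequences(structure, core):
    """Yield HP-sequences derived from matches of `core` onto `structure`.

    structure: list of (x, y, z) positions of the target structure X
    core:      set of (x, y, z) positions of an H-core
    A match is a translation of the core that lies completely on positions
    of the structure; matched positions become H, all others P.
    """
    positions = set(structure)
    anchor = min(core)  # fixed reference point of the core
    seen = set()
    for target in structure:
        shift = tuple(t - a for t, a in zip(target, anchor))
        moved = {tuple(c + s for c, s in zip(p, shift)) for p in core}
        if not moved <= positions:
            continue  # the shifted core leaves the structure: no match
        seq = "".join("H" if p in moved else "P" for p in structure)
        if seq not in seen:
            seen.add(seq)
            yield seq
```
]]></p>
               <p>Each yielded sequence would still have to be passed to a degeneracy computation to verify that the target structure is its unique optimum.</p>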
            </sec>
            <sec>
               <st>
                  <p>Advanced Features</p>
               </st>
               <p>Often sequences with a special ratio of H/P occurrences or with only limited degeneracy are of interest. Both can be specified using HPDESIGN.</p>
               <p>Furthermore, the number of evaluated H-cores is selectable to allow a balancing between runtime and completeness. This is done by adjusting their allowed level of optimality used in the filtering step.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>HPnnet</p>
            </st>
            <sec>
               <st>
                  <p>Motivation</p>
               </st>
               <p>The organisation of sequence space in neutral networks provides insights into evolutionary principles <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B30">30</abbr></abbrgrp>. Such networks can be expanded using HPNNET. A neutral network for a given structure <it>X </it>is an undirected binary graph, where each node represents a sequence that forms <it>X </it>as its unique optimal structure. Edges connect evolutionarily related sequences, i.e. sequences that differ in only one sequence position (a point mutation). HPNNET expands a neutral network starting from an initial sequence (or a set of sequences) <it>S </it>that folds into the structure <it>X</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>The method follows the generate-and-test paradigm. Recursively, all neighboring sequences of <it>S </it>are tested for whether they adopt <it>X </it>as their unique optimum. If so, they are added to the network and their neighbors are checked. Therefore, HPNNET is capable of detecting and expanding connected neutral networks of different structures.</p>
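               <p>The generate-and-test expansion can be sketched in Python as follows. The sketch uses an explicit queue instead of recursion and takes the expensive test (whether a sequence adopts <it>X </it>as its unique optimum) as a caller-supplied predicate; the names <it>expand_neutral_network</it>, <it>point_mutants</it>, and <it>is_neutral</it> are hypothetical and do not refer to the HPNNET interface.</p>
               <p><![CDATA[
```python
from collections import deque


def point_mutants(seq):
    """Yield all HP-sequences that differ from `seq` in exactly one position."""
    for i, c in enumerate(seq):
        yield seq[:i] + ("P" if c == "H" else "H") + seq[i + 1:]


def expand_neutral_network(seeds, is_neutral):
    """Expand the connected components of the neutral network that contain
    the seed sequences.

    is_neutral(seq) must return True if `seq` adopts the target structure
    as its unique optimum (hypothetical predicate supplied by the caller).
    Returns (nodes, edges); each edge is a frozenset of two sequences.
    """
    verdict = {}  # cache so that every sequence is tested only once

    def check(seq):
        if seq not in verdict:
            verdict[seq] = is_neutral(seq)
        return verdict[seq]

    nodes, edges, queue = set(), set(), deque()
    for s in seeds:
        if s not in nodes and check(s):
            nodes.add(s)
            queue.append(s)
    while queue:
        seq = queue.popleft()
        for mutant in point_mutants(seq):
            if not check(mutant):
                continue
            edges.add(frozenset((seq, mutant)))
            if mutant not in nodes:
                nodes.add(mutant)
                queue.append(mutant)
    return nodes, edges
```
]]></p>
               <p>Passing several seed sequences expands all components they belong to in a single run, which corresponds to the combined use of designed sequences and network expansion.</p>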
            </sec>
            <sec>
               <st>
                  <p>Advanced Features</p>
               </st>
               <p>Running HPNNET with <it>S </it>as the only start sequence results in the connected component of the network that <it>S </it>belongs to. However, Blackburne and Hirst <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> have shown by exhaustive enumeration in restricted models that neutral networks may consist of several connected components. To find and study them in complex three-dimensional lattices, a combination of HPDESIGN and HPNNET can be used. The independently designed sequences resulting from HPDESIGN have a high chance of belonging to different components. HPNNET accepts such a set of sequences as input and expands all corresponding connected components. An example is shown later in the Results and Discussion section.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Utility tools</p>
            </st>
            <p>In addition to those described above, CPSP-tools provides a set of utility programs helpful for lattice protein studies. For instance, HPCONVERT converts between absolute move strings and 3D-position data in XYZ, Protein Data Bank (PDB), and Chemical Markup Language (CML) format. A move string normalization, as well as a conversion into an orientation-independent relative move string, is available for symmetry-independent structure comparison.</p>
            <p>HPVIEW interactively visualizes structures in 2D-square, 3D-cubic, and 3D-FCC lattices using the Jmol interface <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Installation and Usage</p>
            </st>
            <p>The package supplies standard installation procedures for Linux based on common tools (GNU automake) and can be compiled and installed easily on current 32- and 64-bit Linux systems (including Cygwin for Microsoft Windows&#8482;). The programs are written in C++ for highest performance and provide a slim text-based user interface for efficient pipelining as required for high-throughput experiments. A web front end is under development.</p>
            <p>All constraint programming based algorithms utilize the open source Gecode system <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
            <p>The validity of the algorithms has been tested and confirmed on a large set of benchmark problems. The functionality for H-core database access, structure prediction, and degeneracy computation is collected in the C++ CPSP-library. A complete API is included, which allows the embedding, extension, and use of the CPSP approach in new programs.</p>
            <p>To reduce package size, only a small fraction of the H-core database is included in the source package. This already enables the use of CPSP-tools for short sequences. The complete database is available on request.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <p>For illustration, we provide some scenarios that exemplify the use of CPSP-tools in extending known or enabling new studies. All examples are performed in the unrestricted 3D-cubic lattice with HP-sequences of length 27. Note that for this length there are already more than 10<sup>19 </sup>possible structures, which makes an exhaustive enumeration inapplicable. Table <tblr tid="T1">1</tblr> outlines the performance of programs from CPSP-tools. Table <tblr tid="T2">2</tblr> shows the sequences used for Table <tblr tid="T1">1</tblr>, their optimal energy (<it>E</it>), and degeneracy (<it>deg</it>). All tasks were performed on an Intel P4 3 GHz (using CPSP-2.0.0).</p>
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Exemplary runs and data. Example runs of the exemplified CPSP-tools application scenarios. The corresponding sequences and structures are given in Table 2. The neutral net <it>N </it>is given in Figure 3.</p>
            </caption>
            <tblbdy cols="5">
               <r>
                  <c ca="center">
                     <p>Appl.</p>
                  </c>
                  <c ca="left">
                     <p>Tool</p>
                  </c>
                  <c ca="center">
                     <p>Parameter</p>
                  </c>
                  <c ca="center">
                     <p>Result</p>
                  </c>
                  <c ca="right">
                     <p>Runtime</p>
                  </c>
               </r>
               <r>
                  <c cspan="5">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="left">
                     <p>HPDEG</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>0</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>471354</p>
                  </c>
                  <c ca="right">
                     <p>2.5 s</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="left">
                     <p>HPDEG</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>1</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
                  <c ca="right">
                     <p>0.2 s</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>2</p>
                  </c>
                  <c ca="left">
                     <p>HPSTRUCT</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>0</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p><it>X</it><sub>0</sub>, <it>E </it>= -13</p>
                  </c>
                  <c ca="right">
                     <p>0.01 s</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>2</p>
                  </c>
                  <c ca="left">
                     <p>HPSTRUCT</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>1</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p><it>X</it><sub>1</sub>, <it>E </it>= -22</p>
                  </c>
                  <c ca="right">
                     <p>0.06 s</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>3</p>
                  </c>
                  <c ca="left">
                     <p>HPNNET</p>
                  </c>
                  <c ca="center">
                     <p><it>X</it><sub>1</sub>, <it>S</it><sub>1</sub>, <it>deg </it>= 1</p>
                  </c>
                  <c ca="center">
                     <p><it>S</it><sub>1 </sub>.. <it>S</it><sub>4</sub></p>
                  </c>
                  <c ca="right">
                     <p>9 s</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>HPDESIGN</p>
                  </c>
                  <c ca="center">
                     <p><it>X</it><sub>1</sub>, <it>minH </it>= 17, <it>so </it>= 2</p>
                  </c>
                  <c ca="center">
                     <p><it>S</it><sub>1 </sub>.. <it>S</it><sub>12</sub></p>
                  </c>
                  <c ca="right">
                     <p>13 m 43 s</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>4</p>
                  </c>
                  <c ca="left">
                     <p>HPNNET</p>
                  </c>
                  <c ca="center">
                     <p><it>X</it><sub>1</sub>, <it>S</it><sub>1 </sub>.. <it>S</it><sub>12</sub>, <it>deg </it>= 1</p>
                  </c>
                  <c ca="center">
                     <p><it>N, S</it><sub>1 </sub>.. <it>S</it><sub>14</sub></p>
                  </c>
                  <c ca="right">
                     <p>1 m</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T2">
            <title>
               <p>Table 2</p>
            </title>
            <caption>
               <p>Data of exemplary runs. </p>
            </caption>
            <tblbdy cols="4">
               <r>
                  <c ca="center">
                     <p>id</p>
                  </c>
                  <c ca="center">
                     <p>Sequence</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>E</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>deg</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="4">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>0</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>PPHPPHHHPHPPPHPHHHPPHPPHHPP</p>
                  </c>
                  <c ca="center">
                     <p>-13</p>
                  </c>
                  <c ca="center">
                     <p>471354</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>1</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHPHPHPHPHHHHHHPH</p>
                  </c>
                  <c ca="center">
                     <p>-22</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>2</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHHHPHPHPHPHHHHHHPH</p>
                  </c>
                  <c ca="center">
                     <p>-23</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>3</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHPHPHHHPHHHHHHPH</p>
                  </c>
                  <c ca="center">
                     <p>-23</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>4</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHHHPHPHHHPHHHHHHPH</p>
                  </c>
                  <c ca="center">
                     <p>-24</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>5</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHHHPHPHHHPHHPHHH</p>
                  </c>
                  <c ca="center">
                     <p>-23</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>6</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHPHPHHHPHPHHPHPH</p>
                  </c>
                  <c ca="center">
                     <p>-22</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>7</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHHHPHHHHHPHHPHHH</p>
                  </c>
                  <c ca="center">
                     <p>-24</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>8</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPPPHPHPHHHPHPHHPHPH</p>
                  </c>
                  <c ca="center">
                     <p>-20</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>9</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHHHPHHHPHPHHPHPH</p>
                  </c>
                  <c ca="center">
                     <p>-22</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>10</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHPHPHHHPHPHHPHHH</p>
                  </c>
                  <c ca="center">
                     <p>-22</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>11</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHHHPHPHPHPHHPHHH</p>
                  </c>
                  <c ca="center">
                     <p>-22</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>12</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHHHPHHHPHPHHPHHH</p>
                  </c>
                  <c ca="center">
                     <p>-23</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>13</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHPHPHPHPHPHHPHPH</p>
                  </c>
                  <c ca="center">
                     <p>-21</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>S</it>
                        <sub>14</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>HHHHHPHHPHPHPHPHPHPHPHHPHHH</p>
                  </c>
                  <c ca="center">
                     <p>-21</p>
                  </c>
                  <c ca="center">
                     <p>1</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>X</it>
                        <sub>0</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>FLUFDDRBLBULFLDRFFUBULDDDR</p>
                  </c>
                  <c cspan="2" ca="center">
                     <p>
                        <it>S</it>
                        <sub>0</sub>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>
                        <it>X</it>
                        <sub>1</sub>
                     </p>
                  </c>
                  <c ca="center">
                     <p>FLUURDBULLFFRRDDLLBBRULFFR</p>
                  </c>
                  <c cspan="2" ca="center">
                     <p><it>S</it><sub>1 </sub>.. <it>S</it><sub>14</sub></p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>The corresponding sequences and structures for the exemplary runs of CPSP-tools in the 3D-cubic lattice. For each sequence, its optimal energy (<it>E</it>) and degeneracy (<it>deg</it>) are listed. The optimal structures of the sequences are given in absolute move string representation (Forward, Backward, Left, Right, Up and Down). The corresponding neutral net of sequences <it>S</it><sub>1 </sub>.. <it>S</it><sub>14 </sub>is given in Figure 3.</p>
            </tblfn>
         </tbl>
         <p>(1) Studies of sequence or structure features of proteins as done by Huard <it>et al. </it><abbrgrp><abbr bid="B4">4</abbr></abbrgrp> require a classification of sequences as protein-like. One way is to classify them by the number of optimal structures, i.e. their degeneracy. The fast calculation of this sequence property by HPDEG allows the production of sufficiently large benchmark sets for detailed studies. To illustrate this, we run HPDEG on a random HP-sequence <it>S</it><sub>0</sub>, which reveals an enormous degeneracy of 471354 optimal structures (Table <tblr tid="T2">2</tblr>); such high degeneracy is a frequent finding in the HP-model. As a starting point for the following scenarios, we evaluate the degeneracy of <it>S</it><sub>1</sub>, a sequence with a single optimal structure. The very short runtimes for both checks are given in Table <tblr tid="T1">1</tblr>.</p>
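         <p>The degeneracy used in scenario (1) can be stated operationally: it is the number of structures of a sequence that attain the minimal energy, where each contact between two H-monomers that are not chain neighbours contributes -1. The following Python sketch only illustrates this definition and is not how HPDEG works: it enumerates every self-avoiding walk, which is feasible for very short sequences only, whereas the CPSP approach avoids exhaustive enumeration. The function names are hypothetical, and the count includes rotated and mirrored copies of each structure, which may differ from the convention HPDEG uses.</p>

```python
# The six unit steps of the 3D-cubic lattice.
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def count_contacts(sequence, points):
    """Count H-H contacts between monomers that are not chain neighbours."""
    index_at = {p: i for i, p in enumerate(points)}
    contacts = 0
    for i, (x, y, z) in enumerate(points):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in MOVES:
            j = index_at.get((x + dx, y + dy, z + dz))
            # j - i >= 2 skips chain neighbours and counts each pair once
            if j is not None and j - i >= 2 and sequence[j] == "H":
                contacts += 1
    return contacts


def brute_force_degeneracy(sequence):
    """Return (optimal energy, number of optimal self-avoiding walks).

    Illustration only: the runtime grows exponentially with the sequence
    length, and symmetric copies of a structure are counted separately.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    best_energy = None
    count = 0
    stack = [[(0, 0, 0)]]  # partial walks, all starting at the origin
    while stack:
        points = stack.pop()
        if len(points) == len(sequence):
            energy = -count_contacts(sequence, points)
            if best_energy is None or best_energy > energy:
                best_energy, count = energy, 1
            elif energy == best_energy:
                count += 1
            continue
        x, y, z = points[-1]
        for dx, dy, dz in MOVES:
            nxt = (x + dx, y + dy, z + dz)
            if nxt not in points:  # keep the walk self-avoiding
                stack.append(points + [nxt])
    return best_energy, count


print(brute_force_degeneracy("HPPH"))
```

         <p>For the toy sequence HPPH, the only possible contact is between the two terminal H-monomers, so every optimal walk closes three sides of a unit square. Sequences of realistic length, such as <it>S</it><sub>0 </sub>with 27 monomers, are far beyond the reach of such an enumeration.</p>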
         <p>(2) Calculating the globally optimal structure for a given sequence is the main task in many studies, e.g. see Jacob and Unger <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Furthermore, knowing the minimal possible energy is helpful in stochastic folding simulation approaches. Both can be calculated extremely rapidly using HPSTRUCT. Again, we demonstrate this with sequences <it>S</it><sub>0 </sub>and <it>S</it><sub>1</sub>. This yields energies of -13 and -22 and the optimal structures <it>X</it><sub>0 </sub>and <it>X</it><sub>1</sub>, respectively. Both structures are visualized in Figure <figr fid="F2">2</figr>.</p>
         <fig id="F2">
            <title>
               <p>Figure 2</p>
            </title>
            <caption>
               <p>Structures in 3D-cubic lattice</p>
            </caption>
            <text>
               <p><b>Structures in 3D-cubic lattice</b>. An optimal structure <it>X</it><sub>0 </sub>for sequence <it>S</it><sub>0 </sub>and the unique optimal structure <it>X</it><sub>1 </sub>of <it>S</it><sub>1 </sub>from Table 2 in the 3D-cubic lattice. The coloring shows H-monomers in green and P-monomers in grey.</p>
            </text>
            <graphic file="1471-2105-9-230-2"/>
         </fig>
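         <p>The energies reported in scenario (2) can be checked directly from the absolute move strings of Table 2. The following Python sketch decodes a move string into lattice points and counts the H-H contacts between monomers that are not chain neighbours; each contact contributes -1 to the HP energy. The function names and the assignment of the six moves to coordinate axes are assumptions made for illustration; any assignment that keeps Forward/Backward, Left/Right and Up/Down as opposite directions gives the same contact count. The sketch evaluates a given structure only and does not search for optimal structures as HPSTRUCT does.</p>

```python
# Unit vectors for the absolute moves named in Table 2.
# The axis assignment is an assumption made for this sketch.
MOVES = {
    "F": (1, 0, 0), "B": (-1, 0, 0),
    "L": (0, 1, 0), "R": (0, -1, 0),
    "U": (0, 0, 1), "D": (0, 0, -1),
}


def decode(moves):
    """Turn an absolute move string into a list of lattice points."""
    points = [(0, 0, 0)]
    for m in moves:
        dx, dy, dz = MOVES[m]
        x, y, z = points[-1]
        points.append((x + dx, y + dy, z + dz))
    return points


def hp_energy(sequence, moves):
    """Return minus the number of non-bonded H-H contacts of a structure."""
    points = decode(moves)
    if len(points) != len(sequence):
        raise ValueError("a sequence of length n needs n - 1 moves")
    if len(set(points)) != len(points):
        raise ValueError("structure is not self-avoiding")
    index_at = {p: i for i, p in enumerate(points)}
    contacts = 0
    for i, (x, y, z) in enumerate(points):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in MOVES.values():
            j = index_at.get((x + dx, y + dy, z + dz))
            # j - i >= 2 skips chain neighbours and counts each pair once
            if j is not None and j - i >= 2 and sequence[j] == "H":
                contacts += 1
    return -contacts


S1 = "HHHHHPHHPHPHPHPHPHPHHHHHHPH"  # sequence S1 from Table 2
X1 = "FLUURDBULLFFRRDDLLBBRULFFR"   # structure X1 from Table 2
print(hp_energy(S1, X1))
```

         <p>Under these assumptions, the call for <it>S</it><sub>1 </sub>and <it>X</it><sub>1 </sub>reproduces the energy of -22 listed in Table 2. The same function can be applied to any other sequence and structure pair of equal length.</p>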
         <p>(3) To study protein evolution on the sequence level, neutral networks are widely utilized <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Using HPNNET we can span the connected component of the neutral network for a given sequence with a unique optimal structure. Applied to <it>S</it><sub>1 </sub>with <it>X</it><sub>1</sub>, we find four sequences <it>S</it><sub>1 </sub>.. <it>S</it><sub>4 </sub>sharing <it>X</it><sub>1 </sub>as their unique optimal structure. Note that this is done <it>without </it>exhaustive sequence enumeration for a given structure.</p>
         <p>(4) The detailed study of neutral networks by Blackburne and Hirst <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> has shown that neutral networks may decompose into connected components. Their results are based on full enumeration of sequences and structures in the diamond lattice. This approach does not extend to complex lattice models due to the enormous size of the structure space as discussed above.</p>
         <p>HPDESIGN can overcome that problem by directly designing sequences of the neutral network. Recall that the neutral network contains only sequences with the same unique optimal structure. The described design approach allows one to generate sequences of independent components in the neutral network without exhaustive enumeration. Afterwards, the full components can be expanded via HPNNET.</p>
         <p>We apply this approach to the neutral network of the structure <it>X</it><sub>1</sub>. HPDESIGN calculates 12 members of the network (<it>S</it><sub>1 </sub>.. <it>S</it><sub>12</sub>), including the four sequences <it>S</it><sub>1 </sub>.. <it>S</it><sub>4 </sub>known from scenario (3). Expanding the network <it>N </it>from these sequences via HPNNET reveals two further sequences <it>S</it><sub>13</sub>, <it>S</it><sub>14 </sub>and two independent connected components as shown in Figure <figr fid="F3">3</figr>.</p>
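         <p>The decomposition into independent components can be reproduced from the sequences of Table 2 alone. The following Python sketch assumes that two members of the neutral network are neighbours if they differ by a single point mutation, links all such pairs among <it>S</it><sub>1 </sub>.. <it>S</it><sub>14</sub>, and collects the connected components. It is an illustration on already known members only: unlike HPNNET, it does not test whether a mutated sequence still has <it>X</it><sub>1 </sub>as its unique optimal structure. The function names are hypothetical.</p>

```python
from itertools import combinations

# Sequences S1 .. S14 copied from Table 2 (list index 0 holds S1).
SEQUENCES = [
    "HHHHHPHHPHPHPHPHPHPHHHHHHPH",
    "HHHHHPHHPHHHPHPHPHPHHHHHHPH",
    "HHHHHPHHPHPHPHPHHHPHHHHHHPH",
    "HHHHHPHHPHHHPHPHHHPHHHHHHPH",
    "HHHHHPHHPHPHHHPHPHHHPHHPHHH",
    "HHHHHPHHPHPHPHPHHHPHPHHPHPH",
    "HHHHHPHHPHPHHHPHHHHHPHHPHHH",
    "HHHHHPHHPPPHPHPHHHPHPHHPHPH",
    "HHHHHPHHPHPHHHPHHHPHPHHPHPH",
    "HHHHHPHHPHPHPHPHHHPHPHHPHHH",
    "HHHHHPHHPHPHHHPHPHPHPHHPHHH",
    "HHHHHPHHPHPHHHPHHHPHPHHPHHH",
    "HHHHHPHHPHPHPHPHPHPHPHHPHPH",
    "HHHHHPHHPHPHPHPHPHPHPHHPHHH",
]


def hamming(a, b):
    """Number of positions at which two equally long sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def connected_components(sequences):
    """Group sequences linked by chains of single point mutations."""
    neighbours = {s: set() for s in sequences}
    for a, b in combinations(sequences, 2):
        if hamming(a, b) == 1:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen = set()
    components = []
    for start in sequences:
        if start in seen:
            continue
        seen.add(start)
        todo = [start]
        component = []
        while todo:  # depth-first traversal of one component
            current = todo.pop()
            component.append(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        components.append(sorted(component, key=sequences.index))
    return components


for component in connected_components(SEQUENCES):
    print(["S%d" % (SEQUENCES.index(s) + 1) for s in component])
```

         <p>On the Table 2 data this grouping separates <it>S</it><sub>1 </sub>.. <it>S</it><sub>4 </sub>from <it>S</it><sub>5 </sub>.. <it>S</it><sub>14</sub>, in line with the two independent components described above.</p>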
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>Neutral net</p>
            </caption>
            <text>
               <p><b>Neutral net</b>. Known independent components of the neutral network for structure <it>X</it><sub>1 </sub>from Table 2 in the 3D-cubic lattice. The border size corresponds to the node degree. The structure is visualized in Figure 2.</p>
            </text>
            <graphic file="1471-2105-9-230-3"/>
         </fig>
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>H-core database statistics</p>
            </caption>
            <text>
               <p><b>H-core database statistics</b>. The number of different H-cores for several numbers of H-monomers (H-core size) in the 3D-cubic lattice. The three curves represent different levels of optimality of the H-cores.</p>
            </text>
            <graphic file="1471-2105-9-230-4"/>
         </fig>
         <p>Preliminary studies performed with CPSP-tools indicate that neutral networks as large as <it>N </it>with several large independent components are rare in the unrestricted 3D-cubic model.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>For complex 3D models, mainly heuristic and/or stochastic approaches to search for optimal structures of a given sequence are available <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B33">33</abbr></abbrgrp>. However, these methods (a) are incomplete and (b) cannot ensure the global optimality of the predicted structures. In consequence, the investigation of problems requiring this information was only possible using exhaustive enumeration, which is infeasible for longer sequences.</p>
         <p>The CPSP approach is as fast as common stochastic methods <it>but ensures </it>that all predicted structures are globally optimal, and that none are missing. This is achieved without exhaustive exploration of the structure space by applying constraint-programming techniques. Therefore, it is well suited to many studies in complex 3D models, especially for finding protein-like sequences, the investigation of neutral networks, or sequence design. Further applications range from the generation of candidate sets to the validation of results of folding simulations and stochastic optimization methods.</p>
         <p>The CPSP-tools package combines several applications in the field of bioinformatics concerning 3D lattice proteins. It allows advanced investigation of problems related to protein structure prediction, sequence evolution, inverse folding, and energy landscapes.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p><b>Project name</b>: CPSP-tools</p>
         <p><b>Project home page</b>: <url>http://www.bioinf.uni-freiburg.de/sw/cpsp/</url></p>
         <p><b>Operating system(s)</b>: all Linux-based systems (including Cygwin for MS Windows&#8482;)</p>
         <p><b>Programming language</b>: C++</p>
         <p><b>Other requirements</b>: Gecode and BIU library (a source bundle is provided)</p>
         <p><b>License</b>: BSD-style license</p>
         <p><b>Any restrictions to use by non-academics</b>: none</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>Implementation and software design were done by MM and SW. The CPSP method was developed by RB and SW and extended by SW and MM. The CPSP-derived algorithms were designed by all authors. All authors have approved and contributed to the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>Martin Mann is supported by the EU project EMBIO (EC contract number 012835). Sebastian Will is partially supported by the EU Network of Excellence REWERSE (project reference number 506779).</p>
            <p>We further thank the reviewers of an earlier version of the manuscript for their helpful comments and Rhodri Saunders for proofreading.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>CPSP-tools</p>
            </title>
            <url>http://www.bioinf.uni-freiburg.de/sw/cpsp/</url>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Different mechanistic requirements for prokaryotic and eukaryotic chaperonins: a lattice study</p>
            </title>
            <aug>
               <au>
                  <snm>Jacob</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Horovitz</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Unger</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <fpage>240</fpage>
            <lpage>248</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btm180</pubid>
                  <pubid idtype="pmpid" link="fulltext">17018534</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Exploring the lower part of discrete polymer model energy landscapes</p>
            </title>
            <aug>
               <au>
                  <snm>Wolfinger</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Will</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hofacker</snm>
                  <fnm>IL</fnm>
               </au>
               <au>
                  <snm>Backofen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Stadler</snm>
                  <fnm>PF</fnm>
               </au>
            </aug>
            <source>Europhysics Lett</source>
            <pubdate>2006</pubdate>
            <volume>74</volume>
            <fpage>725</fpage>
            <lpage>732</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1209/epl/i2005-10577-0</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Modelling sequential protein folding under kinetic control</p>
            </title>
            <aug>
               <au>
                  <snm>Huard</snm>
                  <fnm>FP</fnm>
               </au>
               <au>
                  <snm>Deane</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Woo</snm>
                  <fnm>GR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>202</fpage>
            <lpage>210</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btl248</pubid>
                  <pubid idtype="pmpid" link="fulltext">16287938</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Finding the lowest free energy conformation of a protein is an NP-hard problem: proof and implications</p>
            </title>
            <aug>
               <au>
                  <snm>Unger</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Moult</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bull Math Biol</source>
            <pubdate>1993</pubdate>
            <volume>55</volume>
            <fpage>1183</fpage>
            <lpage>1198</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8281131</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete</p>
            </title>
            <aug>
               <au>
                  <snm>Berger</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Leighton</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Comp Biol</source>
            <pubdate>1998</pubdate>
            <volume>5</volume>
            <fpage>27</fpage>
            <lpage>40</lpage>
         </bibl>
         <bibl id="B7">
            <title>
               <p>The protein sequence design problem in canonical model on 2D and 3D lattices</p>
            </title>
            <aug>
               <au>
                  <snm>Berman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>DasGupta</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mubayi</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sloan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tur&#225;n</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Combinatorial Pattern Matching</source>
            <publisher>Springer</publisher>
            <pubdate>2004</pubdate>
            <volume>3109</volume>
            <fpage>244</fpage>
            <lpage>253</lpage>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Growth-based optimization algorithm for lattice heteropolymers</p>
            </title>
            <aug>
               <au>
                  <snm>Hsu</snm>
                  <fnm>HP</fnm>
               </au>
               <au>
                  <snm>Mehra</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Nadler</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Grassberger</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Phys Rev E</source>
            <pubdate>2003</pubdate>
            <volume>68</volume>
            <fpage>021113</fpage>
            <xrefbib>
               <pubid idtype="doi">10.1103/PhysRevE.68.021113</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Forces of tertiary structural organization in globular proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Yue</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dill</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci</source>
            <pubdate>1995</pubdate>
            <volume>92</volume>
            <fpage>146</fpage>
            <lpage>150</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">42834</pubid>
                  <pubid idtype="pmpid" link="fulltext">7816806</pubid>
                  <pubid idtype="doi">10.1073/pnas.92.1.146</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Designability of protein structures: a lattice-model study using the Miyazawa-Jernigan matrix</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wingreen</snm>
                  <fnm>NS</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2002</pubdate>
            <volume>49</volume>
            <fpage>403</fpage>
            <lpage>412</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.10239</pubid>
                  <pubid idtype="pmpid" link="fulltext">12360530</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Population dynamics simulations of functional model proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Blackburne</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Hirst</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>J Chem Phys</source>
            <pubdate>2005</pubdate>
            <volume>123</volume>
            <fpage>154907</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1063/1.2056545</pubid>
                  <pubid idtype="pmpid" link="fulltext">16252972</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The complexity and accuracy of discrete state models of protein structure</p>
            </title>
            <aug>
               <au>
                  <snm>Park</snm>
                  <fnm>BH</fnm>
               </au>
               <au>
                  <snm>Levitt</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1995</pubdate>
            <volume>249</volume>
            <fpage>493</fpage>
            <lpage>507</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1995.0311</pubid>
                  <pubid idtype="pmpid" link="fulltext">7783205</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>A constraint-based approach to fast and exact structure prediction in three-dimensional protein models</p>
            </title>
            <aug>
               <au>
                  <snm>Backofen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Will</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Constraints</source>
            <pubdate>2006</pubdate>
            <volume>11</volume>
            <fpage>5</fpage>
            <lpage>30</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1007/s10601-006-6848-8</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>A lattice statistical mechanics model of the conformational and sequence spaces of proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Lau</snm>
                  <fnm>KF</fnm>
               </au>
               <au>
                  <snm>Dill</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Macromolecules</source>
            <pubdate>1989</pubdate>
            <volume>22</volume>
            <fpage>3986</fpage>
            <lpage>3997</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1021/ma00200a030</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Recombinatoric exploration of novel folded structures: a heteropolymer-based model of protein evolutionary landscapes</p>
            </title>
            <aug>
               <au>
                  <snm>Cui</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Bornberg-Bauer</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>HS</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>809</fpage>
            <lpage>814</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117387</pubid>
                  <pubid idtype="pmpid" link="fulltext">11805332</pubid>
                  <pubid idtype="doi">10.1073/pnas.022240299</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>A tale of two tails: why are terminal residues of proteins exposed?</p>
            </title>
            <aug>
               <au>
                  <snm>Jacob</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Unger</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <fpage>225</fpage>
            <lpage>230</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/bioinformatics/btl318</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Three-dimensional functional model proteins: structure function and evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Blackburne</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Hirst</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>J Chem Phys</source>
            <pubdate>2003</pubdate>
            <volume>119</volume>
            <fpage>3453</fpage>
            <lpage>3460</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1063/1.1590310</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Number of n-step self-avoiding walks on cubic lattice</p>
            </title>
            <aug>
               <au>
                  <snm>Sloane</snm>
                  <fnm>NJA</fnm>
               </au>
            </aug>
            <source>On-Line Encyclopedia of Integer Sequences</source>
            <pubdate>2007</pubdate>
            <url>http://www.research.att.com/~njas/sequences/A001412</url>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Constraint Logic Programming approach to protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Dal Pal&#249;</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Dovier</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Fogolari</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>186</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539352</pubid>
                  <pubid idtype="pmpid" link="fulltext">15571634</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-186</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>PSICO: Solving protein structures with constraint programming and optimization</p>
            </title>
            <aug>
               <au>
                  <snm>Krippahl</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Barahona</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Constraints</source>
            <pubdate>2002</pubdate>
            <volume>7</volume>
            <fpage>317</fpage>
            <lpage>331</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1023/A:1020577603762</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Modeling protein complexes with BiGGER</p>
            </title>
            <aug>
               <au>
                  <snm>Krippahl</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Moura</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Palma</snm>
                  <fnm>PN</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2003</pubdate>
            <volume>52</volume>
            <fpage>19</fpage>
            <lpage>23</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.10387</pubid>
                  <pubid idtype="pmpid" link="fulltext">12784362</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Protein folding in the generalized hydrophobic-polar model on the triangular lattice</p>
            </title>
            <aug>
               <au>
                  <snm>Decatur</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <pubdate>1996</pubdate>
            <note>[Technical Memo MIT-LCS-TM-559, Massachusetts Institute of Technology].</note>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Residue coordination in proteins conforms to the closest packing of spheres</p>
            </title>
            <aug>
               <au>
                  <snm>Bagci</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Jernigan</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Bahar</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Polymer</source>
            <pubdate>2002</pubdate>
            <volume>43</volume>
            <fpage>451</fpage>
            <lpage>459</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/S0032-3861(01)00427-X</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A structural model of latent evolutionary potentials underlying neutral networks in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Wroe</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>HS</fnm>
               </au>
               <au>
                  <snm>Bornberg-Bauer</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>HFSP J</source>
            <pubdate>2007</pubdate>
            <volume>1</volume>
            <fpage>79</fpage>
            <lpage>87</lpage>
            <xrefbib>
               <pubid idtype="doi">10.2976/1.2739116</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Optimally Compact Finite Sphere Packings &#8211; Hydrophobic Cores in the FCC</p>
            </title>
            <aug>
               <au>
                  <snm>Backofen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Will</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proc of the 12th Annual Symposium on Combinatorial Pattern Matching</source>
            <publisher>Springer</publisher>
            <pubdate>2001</pubdate>
            <volume>2089</volume>
            <fpage>257</fpage>
            <lpage>272</lpage>
         </bibl>
         <bibl id="B26">
            <aug>
               <au>
                  <snm>Marriott</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Stuckey</snm>
                  <fnm>PJ</fnm>
               </au>
            </aug>
            <source>Programming with Constraints: an Introduction</source>
            <publisher>The MIT Press</publisher>
            <pubdate>1998</pubdate>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Modeling the effects of mutations on the denatured states of proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Shortle</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>HS</fnm>
               </au>
               <au>
                  <snm>Dill</snm>
                  <fnm>KA</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1992</pubdate>
            <volume>1</volume>
            <fpage>201</fpage>
            <lpage>215</lpage>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Counting protein structures by DFS with dynamic decomposition</p>
            </title>
            <aug>
               <au>
                  <snm>Will</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mann</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc of Workshop on Constraint Based Methods for Bioinformatics</source>
            <pubdate>2006</pubdate>
            <fpage>83</fpage>
            <lpage>90</lpage>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Structure-approximating inverse protein folding problem in the 2D HP model</p>
            </title>
            <aug>
               <au>
                  <snm>Gupta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Manuch</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stacho</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>J Comp Biol</source>
            <pubdate>2005</pubdate>
            <volume>12</volume>
            <fpage>1328</fpage>
            <lpage>1345</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1089/cmb.2005.12.1328</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Networks in molecular evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Schuster</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Stadler</snm>
                  <fnm>PF</fnm>
               </au>
            </aug>
            <source>Complexity</source>
            <pubdate>2002</pubdate>
            <volume>8</volume>
            <fpage>34</fpage>
            <lpage>42</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1002/cplx.10052</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Jmol: an open-source Java viewer for chemical structures in 3D</p>
            </title>
            <url>http://jmol.sourceforge.net/</url>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Gecode &#8211; generic constraint development environment</p>
            </title>
            <url>http://www.gecode.org</url>
         </bibl>
         <bibl id="B33">
            <title>
               <p>An ant colony optimisation algorithm for the 2D and 3D hydrophobic polar protein folding problem</p>
            </title>
            <aug>
               <au>
                  <snm>Shmygelska</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hoos</snm>
                  <fnm>HH</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>30</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">555464</pubid>
                  <pubid idtype="pmpid" link="fulltext">15710037</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-30</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
