Abstract
Background
Given a protein's amino acid sequence, the protein structure prediction problem is to find a three dimensional structure that has the native energy level. For many decades, it has been one of the most challenging problems in computational biology. A simplified version of the problem is to find an on-lattice self-avoiding walk that minimizes the interaction energy among the amino acids. Local search methods have been preferred for solving the protein structure prediction problem because they find very good solutions quickly. However, they suffer mainly from two problems: revisitation and stagnancy.
Results
In this paper, we present an efficient local search algorithm that deals with these two problems. During search, we select the best candidate at each iteration, but store the unexplored second best candidates in a set of elite conformations, and explore them whenever the search faces stagnation. Moreover, we propose a new non-isomorphic encoding for the protein conformations to store the conformations and to check similarity when applied with a memory-based search. This new encoding helps eliminate conformations that are equivalent under rotation and translation, and thus results in better prevention of revisitation.
Conclusion
On standard benchmark proteins, our algorithm significantly outperforms the state-of-the-art approaches for the Hydrophobic-Polar energy model and the Face Centered Cubic lattice.
Background
Proteins are among the most important molecules in the living cell. Given a protein's amino acid sequence, the protein structure prediction (PSP) problem is to find a three dimensional native structure that has the lowest free energy. In order to function properly, the protein has to fold into its native structure. Misfolded proteins cause many critical diseases such as Alzheimer's disease, cystic fibrosis, and mad cow disease. Knowledge about this native structure is of paramount importance and can have an enormous impact on the field of drug discovery. Not much is known about the folding process, and the nature of the energy function is also very complex. For many decades, PSP has been considered one of the hardest problems in biology. In vitro laboratory methods like X-ray crystallography and Nuclear Magnetic Resonance (NMR) spectroscopy are slow and expensive. For these reasons, many researchers from other fields have been attracted to solving the problem using their own techniques [1,2].
Computational methods applied to PSP fall into three broad categories: ab initio, homology modeling and protein threading. The latter two methods depend on the templates (or structures) of known proteins and are useful only when matching templates are found. Research in ab initio PSP has been motivated by the famous Anfinsen's dogma. In 1973, Nobel laureate Christian B. Anfinsen suggested that the native structure of a globular protein is determined only by its primary amino acid sequence [3]. Ab initio PSP can be viewed as a search problem, where one has to find a stable, unique, and kinetically accessible native structure from the space of all possible structures (also called conformations). The search space for this problem, even in the simplified models, contains an astronomically large number of conformations. Therefore, systematic search techniques are almost impractical since they perform exhaustive search and require a huge amount of computational resources. In contrast, local search methods are normally very quick in finding good solutions, although they suffer from revisitation and stagnation, and require good heuristics.
Performance of the computational methods also degrades when applied to the high resolution models that deal with real structures of proteins. This is due to three reasons: i) the contributions of the different forces to the energy function are unknown, ii) protein models with atomic level details require huge computational effort, and iii) the space of possible conformations is very large and complex. For these reasons, the general paradigm of de novo PSP is to begin by sampling a large set of candidate (decoy) structures guided by a scoring function. In the final stage, refinements are made to achieve the real structure. The simplified models, though they lack many details, provide a realistic backbone for the proteins and can be refined to get real structures [4].
Local search algorithms, when applied to large proteins (sequence length around 200 monomers), suffer from frequent revisitation and stagnation. To handle these issues, a number of techniques have been applied in the PSP literature [5-7], including tabu lists, adaptive measures, and various restart mechanisms. Similar approaches have also been used in other domains such as propositional satisfiability [8] and the quadratic assignment problem [9]. Many of the algorithms apply random restarts or restart from the best local minimum [6,7], which do not solve the problem in general.
Our contribution
In this paper, we present a new algorithm for the simplified protein structure prediction problem. During the search, our method selects the best candidate in each iteration, but memorizes the second best conformations that are generated but not selected or explored (called elite conformations) at each iteration. Whenever the search faces stagnation, we select the best conformation from this elite set and continue the search from there. This retreat helps the search diverge. Similar techniques have been used in systematic search techniques like A* search, but they require a huge amount of memory to store the unexplored frontier. We maintain only a small set of previously generated conformations by discarding conformations with similar fitness. This reduces the memory requirement and provides a mechanism to go back to earlier conformations with a lower fitness value but with the potential to lead towards better search regions. We also propose a new non-isomorphic encoding that removes the non-unique or isomorphic conformations from the search space and makes the similarity matching of the conformations efficient. These isomorphic conformations are essentially the same and differ only because of translational and rotational symmetry. We applied this encoding in our algorithm along with the long term memory of local minima proposed in [10]. Experimental results show that our algorithm significantly outperforms the state-of-the-art algorithms on standard benchmark proteins using the Hydrophobic-Polar (HP) energy model and the Face Centered Cubic (FCC) lattice.
Related work
Lau and Dill [1] proposed a simplified HP energy model for the protein structure prediction problem. It has been proved to be a hard combinatorial problem [11]. Due to this complexity, several techniques and their hybridizations have been applied to solve the problem. The similarity with the thermodynamic nature of protein folding led researchers to apply simulated annealing [12,13]. Genetic algorithms were first applied to this problem by Unger and Moult [14]. The basic genetic algorithm was subsequently improved by many researchers [15-17].
Yue and Dill [18] applied constraint based approaches for the first time and developed the Constraint Based Hydrophobic Core Construction (CHCC) algorithm. Their method had several pitfalls: CHCC could only support the HP model and failed to report degeneracy or non-unique structures for several protein sequences. The research group of Rolf Backofen developed the Constraint-based Protein Structure Prediction (CPSP) tool [19], which provided solutions to these problems. However, the CPSP tool depends on pre-calculated cores and does not converge for larger protein sequences. Palu et al. [20] developed the COLA solver using highly optimized constraints and propagators to obtain satisfactory results on small and medium-sized instances (length < 80). Lesh et al. [5] provided a novel set of transformations called pull moves, extendible to any lattice. Both Lesh et al. [5] and Blazewicz et al. [21] implemented tabu search metaheuristics independently of each other.
Hybrid techniques that combine the power of different strategies provided better results. Using the pull moves, Klau et al. [22] proposed an interactive optimization framework called Human Guided Simple Search (HuGS). Using the same pull move set, Ullah et al. [23] proposed a two-stage optimization approach. Furthermore, Ullah et al. [24] combined local search and constraint programming approaches. They introduced a protein folding simulation procedure on the FCC lattice and employed the COLA solver [20] to generate neighborhood states for a simulated annealing based local search. They used MJ matrices with 20 × 20 amino acid pairwise interactions. They tested their approaches on some real proteins (length < 80) from the Protein Data Bank (PDB). Jiang et al. [25] combined a tabu search strategy (GTS) with genetic algorithms in the two-dimensional HP model.
Cebrian et al. [26] used tabu search to find 3D structures of Harvard instances [27] on FCC lattices for the first time. In their subsequent work, Dotu et al. [6,7] applied Large Neighborhood Search (LNS) to further optimize the results found in [26]. They also improved the tabu search by adopting a new neighborhood selection technique [7]. Both of their methods are implemented in COMET. Shatabda et al. [10] proposed a memory based approach on top of the algorithm proposed by Dotu et al. [7] and improved the results on the FCC lattice and HP energy model. Other methods (such as Simulated Annealing [12], Ant Colony Optimization (ACO) [28], and Extremal Optimization [29]) are also found in the literature.
Materials and methods
Proteins are polymers of amino acid monomers. In a simplified model, all monomers have an equal size and all bonds are of an equal length. Each amino acid monomer is represented by a single point and its position is restricted to a three dimensional lattice. A simplified energy function is used in calculating the energy of a conformation. The given amino acid sequence fits into a fixed lattice, where every two consecutive monomers in the sequence are also neighbors on the lattice (called the chain constraint) and two monomers cannot occupy the same lattice point (called the self avoiding constraint).
FCC lattice
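The Face Centered Cubic (FCC) lattice is preferred over other lattices since it has the highest packing density [30] for spheres of equal size, and provides the highest degree of freedom for placing an amino acid monomer. Thus, it provides a realistic discrete mapping for proteins. The FCC lattice is generated by the following basis vectors: v_{1} = (1, 1, 0), v_{2} = (-1, -1, 0), v_{3} = (-1, 1, 0), v_{4} = (1, -1, 0), v_{5} = (0, 1, 1), v_{6} = (0, 1, -1), v_{7} = (0, -1, -1), v_{8} = (0, -1, 1), v_{9} = (1, 0, 1), v_{10} = (-1, 0, 1), v_{11} = (-1, 0, -1), v_{12} = (1, 0, -1). Two lattice points p, q ∈ L are said to be in contact or neighbors of each other if q = p + v_{i} for some vector v_{i} in the basis of the lattice L.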
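As a minimal sketch of the lattice definition, the following Python fragment builds the FCC basis and tests whether two points are in contact. It assumes the standard integer representation of the FCC basis (one coordinate is 0, the other two are +1 or -1); the names fcc_basis, in_contact and neighbors are illustrative and do not come from the paper.

```python
from itertools import product


def fcc_basis():
    """Build the 12 FCC basis vectors: one coordinate is 0, the other two are +1 or -1."""
    vectors = []
    for zero_axis in range(3):
        for a, b in product((1, -1), repeat=2):
            v = [a, b]
            v.insert(zero_axis, 0)  # place the zero on the chosen axis
            vectors.append(tuple(v))
    return vectors


BASIS = fcc_basis()


def in_contact(p, q):
    """Return True if lattice points p and q differ by exactly one basis vector."""
    diff = tuple(qc - pc for pc, qc in zip(p, q))
    return diff in BASIS


def neighbors(p):
    """Return the 12 lattice points that are in contact with p."""
    return [tuple(pc + vc for pc, vc in zip(p, v)) for v in BASIS]


print(len(BASIS))
print(in_contact((0, 0, 0), (1, 0, -1)))
```

Every basis vector in this representation has squared length 2, so a contact test can equivalently compare the squared Euclidean distance of two lattice points with 2.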
HP energy model
The Hydrophobic-Polar (HP) energy model was proposed by Lau and Dill [1]. In this model, all the amino acids are divided into two groups: hydrophobic H (Gly, Ala, Pro, Val, Leu, Ile, Met, Phe, Tyr, Trp); and hydrophilic or polar P (Ser, Thr, Cys, Asn, Gln, Lys, His, Arg, Asp, Glu). The given amino acid sequence of a protein is represented as a string s over the alphabet {H, P}. The free energy calculation for the HP model, shown in (1), counts only the energy interactions between two non-consecutive amino acid monomers.
E = \sum_{i < j-1} c_{ij} \cdot e_{ij}    (1)
where c_{ij} = 1 only if two monomers i and j are neighbors (or in contact) on the lattice and 0 otherwise. The other term, e_{ij}, is calculated depending on the type of amino acids: e_{ij} = -1 if s_{i} = s_{j} = H and 0 otherwise. Minimizing the summation in (1) is equivalent to maximizing the number of non-consecutive H-H contacts. Several other variants of the HP model [31] exist in the literature.
Using the HP energy model together with the FCC lattice, the simplified PSP problem is defined as: given a sequence s of length n, find a self avoiding walk p_{1} ⋯ p_{n} on the lattice such that the energy defined by (1) is minimized.
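The problem definition can be illustrated with a short Python sketch that checks the two constraints and evaluates the HP energy of a conformation. It is an illustration only, under two assumptions: an H-H contact contributes -1 (so that minimizing the energy maximizes the number of H-H contacts), and two FCC lattice points with integer coordinates are in contact exactly when their squared Euclidean distance is 2. The function names and the four-monomer example are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two lattice points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def is_valid_walk(coords):
    """Check the chain constraint and the self avoiding constraint."""
    # Chain constraint: consecutive monomers must be lattice neighbors.
    chained = all(squared_distance(p, q) == 2 for p, q in zip(coords, coords[1:]))
    # Self avoiding constraint: no lattice point is used twice.
    self_avoiding = len(set(coords)) == len(coords)
    return chained and self_avoiding


def hp_energy(sequence, coords):
    """HP energy: -1 for every pair of non-consecutive H monomers in contact."""
    energy = 0
    n = len(sequence)
    for i in range(n):
        for j in range(i + 2, n):  # j > i + 1 skips consecutive monomers
            both_h = sequence[i] == "H" and sequence[j] == "H"
            if both_h and squared_distance(coords[i], coords[j]) == 2:
                energy -= 1
    return energy


# Hypothetical conformation of four monomers on the FCC lattice.
walk = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (1, 0, 1)]
print(is_valid_walk(walk), hp_energy("HHHH", walk), hp_energy("HPHH", walk))
```

In this toy walk the fourth monomer touches both the first and the second, so the all-H sequence has two contacts while replacing the second monomer with P leaves only one.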
Local search framework
The local search framework was originally proposed in [7]. The algorithm is similar to the procedure localSearch() presented in Table 1, except in Lines 6, 9-10 and 14. It depends on a structured randomized initialization method and maintains a simple tabu list to prevent recently used moves. In the framework, only moves involving a single monomer are allowed. For any given conformation c and a sequence position i, a move(i, p, c) that moves amino acid i to a new position p is allowed if (i) p is free and is in contact with both amino acids at positions i - 1 and i + 1, and (ii) i is not in the tabu list. The length of the tabu list takes a random value from [4, n/4], where n is the length of the sequence. The move can be applied to either an H or a P type amino acid at each iteration. The fitness function minimizes the summation of H-H distances for all non-consecutive pairs of H-monomers. The fitness function can be formally defined as follows:
F(c) = \sum_{i < j-1,\; s_{i} = s_{j} = H} dv(i, j)    (2)
where dv(i, j) = d(i, j)^{2} - 2 and d(i, j)^{2} = (x_{i} - x_{j})^{2} + (y_{i} - y_{j})^{2} + (z_{i} - z_{j})^{2}, i.e. the square of the Euclidean distance between the ith and jth amino acids in the current conformation c of a sequence s of length n. The energy level of the structure is still determined by the HP energy value. The fitness function is used to drive the search only. The search algorithm periodically switches the type of the amino acid and selects the best move on an amino acid which is not in the tabu list. In the case of P moves, it selects a random move, since a move of a P type amino acid does not affect the fitness function. The search restarts from the previously found best solution whenever the fitness function has not improved for maxStable steps. The memory-based search in [10] extends this local search framework. It stores a proportion of the local minima encountered and, whenever a move is selected, it generates the conformation and checks its similarity with the stored local minima. If the generated conformation is within a given proximity of a stored local minimum, the conformation is discarded. Hamming distance is used as the similarity measure and relative encoding is used to represent the conformations.
Table 1. Local Search Framework.
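The fitness function of the framework can be sketched in a few lines of Python. The sketch assumes that dv(i, j) is the squared Euclidean distance between monomers i and j minus 2, so that a pair of H monomers in contact on the FCC lattice contributes 0; the function name fitness and the example coordinates are illustrative only.

```python
def fitness(sequence, coords):
    """Sum of dv(i, j) over all non-consecutive pairs of H monomers.

    dv(i, j) is taken to be the squared Euclidean distance minus 2
    (an assumption of this sketch), so lower values mean a more compact H core.
    """
    h_positions = [i for i, s in enumerate(sequence) if s == "H"]
    total = 0
    for a, i in enumerate(h_positions):
        for j in h_positions[a + 1:]:
            if j - i > 1:  # skip consecutive monomers
                d2 = sum((p - q) ** 2 for p, q in zip(coords[i], coords[j]))
                total += d2 - 2
    return total


# Hypothetical conformations on the FCC lattice.
folded = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (1, 0, 1)]
straight = [(0, 0, 0), (1, 1, 0), (2, 2, 0)]
print(fitness("HHHH", folded), fitness("HHH", straight))
```

Because P monomers never enter the sum, moving a P monomer leaves this value unchanged, which is consistent with the random selection of P moves described above.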
Our algorithm is developed on top of the memory-based search. The pseudocode for our algorithm is depicted in Table 1. Our algorithm differs from the memory-based approach in Line 14 of Procedure localSearch(), where we select a conformation from the elite set at stagnation, and in Line 9 of Procedure selectMove(), where we store the prominent but not selected candidate conformations into the elite set. It also differs in the encoding used to represent the conformations. We apply the encoding at Line 4 of Procedure selectMove(), before matching a conformation with the stored local minima, and at Line 10 of Procedure localSearch(), while storing the local minimum. The rest of this section describes the details of the procedures of our algorithm.
Elite conformations
In each iteration of a local search, a number of conformations are generated. However, only a few of them are explored in the next iterations. In the case of a single candidate search, only a single conformation, which is typically the best conformation according to the heuristic, is selected for the next iteration. In successive iterations, the search goes on by generating the neighbors of the selected conformations. The other potential conformations with good fitness values are never used, as the search is greedy in nature. We call them elite conformations. These conformations, if ever explored, may lead to better search regions. Note that, in systematic search techniques, these conformations are stored and explored. However, they require a huge amount of memory. Moreover, the selection in a systematic search like A* search depends on a heuristic function that requires the goal to be known beforehand. In our case, the optimal structure is totally unknown and we cannot afford to store a huge number of conformations. In our algorithm, we store the second best conformations and explore them whenever the search faces stagnation.
Store
We store the second best conformations of each iteration in a set called the elite set. At each iteration, when a move is selected, we update this elite set of conformations. The pseudocode for the updateEliteSet() procedure is given on the right side of Table 2. We use a priority queue sorted in the order of fitness value and iteration number to store the elite conformations. Before inserting a conformation into the priority queue, we check for similarity in the stored local minima list and store it only if no match is found.
Table 2. Pseudocode for Elite Set Methods.
Explore
We select the top element from the priority queue whenever the search stagnates. The search then continues from the selected elite conformation. The search algorithm, guided by the fitness function defined in (2), quickly forms a compact hydrophobic core at the center of the conformation, and the greedy search oscillates within the same region of the search space before it can improve the fitness function to break the core or to form some alternate core. The detailed nature of the search is discussed in [10]. This oscillating nature indicates that if we select a conformation from a region in the search space, then we can ignore the other conformations with the same or a similar fitness value within the temporal locality. Every time an elite conformation is selected from the list, we also discard a fixed proportion of the top elements from the list. This eliminates the conformations that are similar in fitness value and structure, and are also temporally proximate. This retreat effectively helps the search diverge. It also reduces the memory requirement for the priority queue used. The detailed pseudocode of the method is given on the left side of Table 2. The method elitSet.release() at Line 6 releases the top elements from the elite set.
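The store and explore steps can be sketched together as a small priority queue wrapper. This is a hedged illustration and not the authors' implementation from Table 2: the class name EliteSet, the release_fraction parameter, the similarity predicate passed to update(), and the tie-breaking rule (lower fitness first, then more recent iteration) are all assumptions made for the example.

```python
import heapq
from itertools import count


class EliteSet:
    """Priority queue of unexplored second best conformations (illustrative)."""

    def __init__(self, release_fraction=0.1):
        self._heap = []
        self._order = count()  # tie-breaker so conformations are never compared
        self.release_fraction = release_fraction  # hypothetical parameter

    def __len__(self):
        return len(self._heap)

    def update(self, fitness, iteration, conformation, is_similar_to_minimum):
        """Store a candidate unless it matches a stored local minimum."""
        if is_similar_to_minimum(conformation):
            return False
        # Lower fitness first; among equal fitness, the more recent iteration first.
        entry = (fitness, -iteration, next(self._order), conformation)
        heapq.heappush(self._heap, entry)
        return True

    def release(self):
        """Discard a fixed proportion of the current top elements."""
        for _ in range(int(len(self._heap) * self.release_fraction)):
            heapq.heappop(self._heap)

    def select(self):
        """Return the best stored conformation, then release its close followers."""
        if not self._heap:
            return None
        conformation = heapq.heappop(self._heap)[-1]
        self.release()
        return conformation


elite = EliteSet(release_fraction=0.5)
for it, (fit, conf) in enumerate([(10, "a"), (12, "b"), (11, "c"), (30, "d")], start=1):
    elite.update(fit, it, conf, lambda c: False)
print(elite.select(), len(elite))
```

Releasing the entries that immediately follow the selected one is what removes candidates with nearly the same fitness, so the next retreat starts from a noticeably different region.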
Nonisomorphic encoding
Many techniques have been employed in the literature to represent protein conformations. These representations allow the search to keep the candidate conformations updated and to perform operations like similarity checking (memory-based algorithms) and crossover (genetic algorithms). The most obvious way to represent the conformations is to use the Cartesian coordinates of the amino acid monomers. However, such a representation contains translational symmetry, which can be removed if absolute encoding is used. Absolute encoding is derived from the absolute direction vectors between the consecutive points in the amino acid chain. The alphabet size of the absolute encoding depends on the lattice used. For the FCC lattice, the alphabet size is 12 since the number of basis vectors is 12. However, absolute encoding is not suitable for checking similarity between two conformations since it suffers from rotational symmetry. Two identical conformations with rotational symmetry are represented by different absolute encodings (see the example in Figure 1). This type of encoding is called isomorphic encoding. Non-isomorphic encodings provide a solution to this issue. Shatabda et al. [10] used the relative encoding proposed by Backofen et al. [32] in their algorithm. Their encoding scheme starts from a fixed direction and continues to update a base matrix throughout the chain. The efficiency of the algorithm thus depends on the dimension of the lattice. Moreover, a decoding algorithm is needed to get back the absolute encodings or the coordinate points. The computational complexity of their algorithm is O(nl^{3}), where n is the number of absolute directions and l is the dimension of the lattice. The complexity of the decoding algorithm is also O(nl^{3}). A non-isomorphic encoding was also proposed in [33] for cubic lattices that calculates the angles between two consecutive absolute direction vectors and encodes the move sequence. This encoding is also more costly, as it requires computing the angles between the direction vectors.
Figure 1. Isomorphic Encoding. Two identical structures in the cubic lattice having different absolute encodings; the structure on the left has the encoding "DSES", and the structure on the right has the encoding "UNEN", where D = Down, U = Up, N = North, S = South, E = East and W = West.
In this paper, we propose a new non-isomorphic encoding, which is generic for any lattice and requires no separate decoding algorithm; the encoding itself maps to the absolute directions. Instead of relative angles, our algorithm depends on the relative occurrence of the absolute directions within the chain. It requires only O(n) time to encode. The pseudocode of our algorithm is given in Table 3. This algorithm calculates the encoding on the fly. It starts with an empty Map and, every time a new absolute direction is encountered in the sequence, it assigns the next available code to it. Once the mapping for all possible directions has been found, the algorithm reduces to a simple lookup in the mapping array. In the results section, we show the effectiveness of our encoding scheme when applied to the memory-based search [10].
Table 3. Pseudocode for Non-Isomorphic Encoding.
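The first-occurrence relabeling described above can be sketched as follows. This is a minimal Python illustration of the idea, not a transcription of Table 3; the function names are hypothetical, and the Hamming distance helper is included only to show how two encodings might be compared by a memory-based search.

```python
def non_isomorphic_encoding(directions):
    """Relabel absolute directions by their order of first occurrence.

    `directions` is any sequence of hashable absolute directions
    (letters on a cubic lattice, basis vector tuples on the FCC lattice).
    """
    mapping = {}  # absolute direction -> assigned code
    codes = []
    for d in directions:
        if d not in mapping:
            mapping[d] = len(mapping)  # next available code
        codes.append(mapping[d])
    return codes


def hamming_distance(a, b):
    """Number of positions at which two equal-length encodings differ."""
    return sum(x != y for x, y in zip(a, b))


# The two absolute encodings from Figure 1 receive the same code sequence.
left = non_isomorphic_encoding("DSES")
right = non_isomorphic_encoding("UNEN")
print(left, right, hamming_distance(left, right))
```

Each direction is processed with one dictionary lookup, which matches the O(n) encoding time stated above. Inverting the dictionary recovers one absolute direction sequence, which is one possible reading of the remark that no separate decoding algorithm is needed.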
Results and discussion
We implemented our algorithm in C++ and ran experiments on the NICTA (http://www.nicta.com.au) cluster machine. The cluster has a number of machines, each equipped with two 6-core CPUs (AMD Opteron @2.8 GHz, 3 MB L2/6 MB L3 cache) and 64 GB of memory, running Rocks OS (a Linux variant for clusters). We compared the performance of our algorithm to that of the tabu search by Dotu et al. [7] and the memory-based approach proposed in [10]. The algorithms were run 50 times for each of the protein sequences. Each run was given 5 hours to finish. We could not compare our results with the Large Neighborhood Search (LNS) [7] since the COMET program exited with a 'too much memory needed' error for the large-sized benchmark proteins that we selected. We do not show results for the small-sized Harvard instances (length = 48) or other smaller protein sequences, since the algorithms reach near optimal conformations and the differences in the energy levels achieved for these proteins are relatively small.
Results
We show results for two sets of benchmarks in Table 4. The first six proteins are also used by Dotu et al. [7]. The R instances (length = 200) are originally taken from [34] and the f180 instances (length = 200) are provided by Sebastian Will [7]. LS-New denotes our algorithm, LS-Mem denotes the memory-based approach in [10], and LS-Tabu denotes the tabu search by Dotu et al. [7]. The best and average energy levels achieved are reported in Table 4. For our algorithm, we set the proximity measure to 3, stored only 5% of the local minima, and set maxStable to 100. For the other algorithms, we set the parameters as recommended by the authors. The best energy levels reported by Dotu et al. [7] are also shown under the column LNS. These results were produced by large neighborhood search. Optimal lower bounds for the minimum energy values of the proteins, generated by the CPSP tools [19], are also reported under the column 'E_{l}'. Note that these values are obtained by using exhaustive search methods and are used only to evaluate how far our results are from them. The missing values indicate where no such bound was found, and the values marked with * are the values for which the algorithm did not converge even after 24 hours of run time.
Table 4. Experimental Results.
We also used a second set of benchmark proteins derived from the famous Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition (http://predictioncenter.org/casp9/targetlist.cgi). These proteins are of length 230 ± 50. Six protein sequences were randomly chosen from the target list. These sequences were then converted into HP sequences. Results for these six proteins are also given in Table 4 (lower part). The PDB ids for each of these proteins are also given. The parameter settings for these six proteins were kept the same. The LNS column contains no data for these six proteins since they were not used in [7].
Analysis
From the average energy levels shown in boldface in Table 4, it is evident that, for all twelve proteins, our algorithm significantly outperforms both of the other algorithms. We performed a statistical t-test for independent samples at the 95% level of significance to verify that the differences in performance are significant. We report new lowest energy levels (w.r.t. incomplete search methods) for all twelve proteins. These energy levels are shown in italic font in Table 4.
Relative improvement
In Table 4, we report the relative improvement in column 'R.I.'. The relative improvement of our approach is measured in terms of the difference from the optimal bound of the energy level. This value is significant because it gets harder to find better conformations as the energy level of a protein sequence approaches the optimal. We define:
R.I. = \frac{E_{o} - E_{r}}{E_{l} - E_{r}} \times 100\%
where E_{o} is the average energy level achieved by our approach, E_{r} is the average energy level achieved by the other approach, and E_{l} is the optimal lower bound of the energy level. The missing values indicate the absence of any lower bound for the corresponding protein sequence. Similar measurements were also used in [10]. From the values reported in Table 4, we clearly see that our algorithm produces conformations that are significantly better in terms of the average energy level achieved.
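As a small worked example of this measure, the Python sketch below assumes that the relative improvement is the share of the gap between the other approach and the lower bound that our approach closes, i.e. R.I. = (E_{o} - E_{r}) / (E_{l} - E_{r}) × 100%. Both this form of the formula and the energy values used are assumptions for illustration; the numbers are invented and are not taken from Table 4.

```python
def relative_improvement(e_ours, e_other, e_lower):
    """Percentage of the gap between the other approach and the lower bound
    that is closed by our approach (assumed form of the R.I. measure)."""
    gap = e_lower - e_other
    if gap == 0:
        raise ValueError("the other approach already reaches the lower bound")
    return (e_ours - e_other) / gap * 100.0


# Invented energy levels: lower (more negative) is better.
print(relative_improvement(e_ours=-350.0, e_other=-300.0, e_lower=-400.0))
```

With these invented values the other approach is 100 energy units above the bound and our approach recovers 50 of them, so the measure evaluates to 50%. The same absolute gain would count for more if the other approach were already closer to the bound.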
Search progress
In Figure 2, we show the search progress of the three algorithms for the protein sequence R1. The average energy level achieved by each of the algorithms over 50 runs is shown. All three algorithms achieve almost the same level of energy initially, but as the search progresses, the tabu search and the memory-based search fail to overcome stagnation. It is evident from the graph that our algorithm continues to improve in stagnant situations and thus produces better results.
Figure 2. Search Progress. Search progress of three algorithms for Protein R1 over 300 minutes.
Effect of the nonisomorphic encoding
The effect of the new non-isomorphic encoding of the protein conformations has been twofold. Firstly, it resulted in a reduction of degeneracy, which is evident in the number of discarded conformations during the search. Secondly, the efficient computation improved the runtime. In the memory-based approach proposed in [10], the authors used the relative encoding proposed in [32]. When applied with the memory-based algorithm proposed in [10], our new encoding resulted in more discards and less computation time, as shown in Table 5. The number of discarded conformations is an approximate measure of the similar conformations encountered during the search. The experimental results for six proteins are shown in Table 5 for the first one million iterations.
Table 5. Effect of Non-Isomorphic Encoding.
Conclusions
In this paper, we presented a local search algorithm for solving the protein structure prediction problem on the FCC lattice using the low resolution HP energy model. Experimental results show that our algorithm outperforms the state-of-the-art algorithms. We used a novel encoding scheme to represent the conformations, along with a set of elite conformations to handle the stagnation of the local search. We believe that the use of domain specific heuristics while selecting the conformations from the elite set can further improve the performance of the algorithm. In future, we wish to explore this and apply our techniques to higher resolutions and other energy models to see the effect. We also wish to apply our techniques to other domains such as propositional satisfiability and vehicle routing. We believe the proposed encoding scheme will add efficiency to search techniques such as genetic algorithms.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
SS conceived the original idea of elite conformations and nonisomorphic encoding. All authors contributed significantly in the implementation, experimentation and writing of the manuscript and approved the final version.
Declarations
The publication costs for this article were funded by the corresponding author's institution.
This article has been published as part of BMC Bioinformatics Volume 14 Supplement 2, 2013: Selected articles from the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S2.
Acknowledgements
We gratefully acknowledge the support of the Griffith University eResearch Services Team and the use of the High Performance Computing Cluster "Gowonda" to complete this research. We also thank NICTA, which is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.
References

Lau KF, Dill KA: A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules 1989, 22(10):3986-3997.
Klau GW, Lesh N, Marks J, Mitzenmacher M: Human-guided tabu search. Proceedings of the 18th National Conference on Artificial Intelligence 2002, 41-47.
Anfinsen CB: Principles that govern the folding of protein chains. Science 1973, 181(4096):223-230.
Rotkiewicz P, Skolnick J: Fast procedure for reconstruction of full-atom protein models from reduced representations. Journal of Computational Chemistry 2008, 29(9):1460-1465.
Lesh N, Mitzenmacher M, Whitesides S: A complete and effective move set for simplified protein folding. Proceedings of the Seventh Annual International Conference on Research in Computational Molecular Biology (RECOMB '03) 2003, 188-195.
Dotu I, Cebrián M, Van Hentenryck P, Clote P: Protein structure prediction with large neighborhood constraint programming search. In Principles and Practice of Constraint Programming. Springer; 2008:82-96.
Dotu I, Cebrian M, Van Hentenryck P, Clote P: On lattice protein structure prediction revisited. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011, 8(6):1620-1632.
Mazure B, Sais L, Grégoire É: Tabu search for SAT. Proceedings of the National Conference on Artificial Intelligence 1997, 281-285.
Battiti R, Tecchiolli G, et al.: The reactive tabu search. ORSA Journal on Computing 1994, 6:126-126.
Shatabda S, Newton M, Pham DN, Sattar A: Memory-based local search for simplified protein structure prediction. Proceedings of the 3rd ACM Conference on Bioinformatics, Computational Biology and Biomedicine (BCB '12), ACM 2012, 345-352.
Berger B, Leighton T: Protein folding in the hydrophobic-hydrophilic (HP) is NP-complete. Proceedings of the Second Annual International Conference on Computational Molecular Biology (RECOMB '98) 1998, 30-39.
Kawai H, Kikuchi T, Okamoto Y: A prediction of tertiary structures of peptide by the Monte Carlo simulated annealing method. Protein Engineering 1989, 3(2):85-94.
Kapsokalivas L, Gan X, Albrecht AA, Steinhöfel K: Population-based local search for protein folding simulation in the MJ energy model and cubic lattices. Computational Biology and Chemistry 2009, 33(4):283-294.
Unger R, Moult J: A genetic algorithm for three dimensional protein folding simulations. Proceedings of the 5th International Conference on Genetic Algorithms 1993, 581-588.
Konig R, Dandekar T: Improving genetic algorithms for protein folding simulations by systematic crossover. Biosystems 1999, 50:17-25.
Krasnogor N, Hart W, Pelta D: Protein structure prediction with evolutionary algorithms. Proceedings of the Genetic and Evolutionary Computation Conference 1999, 1596-1601.
Hoque T, Chetty M, Sattar A: Protein folding prediction in 3D FCC HP lattice model using genetic algorithm.
Yue K, Dill K: Forces of tertiary structural organization in globular proteins. Proc Natl Acad Sci U S A 1995, 92:146-150.
Mann M, Backofen R: CPSP-tools - Exact and complete algorithms for high-throughput 3D lattice protein studies. BMC Bioinformatics 2008, 9:230.
Alessandro DP, Dovier A, Pontelli E: A constraint solver for discrete lattices, its parallelization, and application to protein structure prediction. Software-Practice and Experience 2007, 37:1405-1449.
Blazewicz J, Dill K, Lukasiak P, Milostan M: A tabu search strategy for finding low energy structures of proteins in HP-model. Computational Methods in Science and Technology 2004, 10:7-19.
Klau GW, Lesh N, Marks J, Mitzenmacher M: Human-guided tabu search. Proceedings of the 18th National Conference on Artificial Intelligence 2002, 41-47.
Ullah AD, Kapsokalivas L, Mann M, Steinhöfel K: Protein folding simulation by two-stage optimization. In Computational Intelligence and Intelligent Systems. Edited by Cai Z, Li Z, Kang Z, Liu Y. 2009, 138.
Ullah AZMD, Steinhöfel K: A hybrid approach to protein folding problem integrating constraint programming with local search. BMC Bioinformatics 2010, 11(S1):39.
Jiang T, Cui Q, Shi G, Ma S: Protein folding simulations of the hydrophobic-hydrophilic model by combining tabu search with genetic algorithms. Journal of Chemical Physics 2003, 119(8):4592-4596.
Cebrián M, Dotú I, Van Hentenryck P, Clote P: Protein structure prediction on the face centered cubic lattice by local search. In Proceedings of the 23rd National Conference on Artificial Intelligence. Volume 1. AAAI'08, AAAI Press; 2008:241-246.
Yue K, Fiebig K, Thomas P, Chan H, Shakhnovich E, Dill K: A test of lattice protein folding algorithms. Proc Natl Acad Sci U S A 1995, 92:325.
Shmygelska A, Hoos H: An ant colony optimisation algorithm for the 2D and 3D hydrophobic polar protein folding problem. BMC Bioinformatics 2005, 6:30.
Lu H, Yang G: Extremal optimization for protein folding simulations on the lattice. Computers & Mathematics with Applications 2009, 57:1855-1861.
Bornberg-Bauer E: Chain growth algorithms for HP-type lattice proteins. In Proceedings of the First Annual International Conference on Computational Molecular Biology. RECOMB '97, New York, NY, USA: ACM; 1997:47-55.
Backofen R, Will S, Clote P: Algorithmic approach to quantifying the hydrophobic force contribution in protein folding. Proceedings of the Pacific Symposium on Biocomputing 2000, 92-103.
Hoque T, Chetty M, Dooley LS: Non-isomorphic coding in lattice model and its impact for protein folding prediction using genetic algorithm. Proceedings of IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), IEEE 2006, 18.
Backofen R, Will S: A constraint-based approach to structure prediction for simplified protein models that outperforms other existing methods.