| Improved prediction of HIV-1 protease-inhibitor binding energies by molecular dynamics simulations1Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA 2Department of Clinical Microbiology, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
BMC Structural Biology 2003, 3:2doi:10.1186/1472-6807-3-2 The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1472-6807/3/2
© 2003 Jenwitheesuk and Samudrala; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL. AbstractBackgroundThe accurate prediction of enzyme-substrate interaction energies is one of the major challenges in computational biology. This study describes the improvement of protein-ligand binding energy prediction by incorporating protein flexibility through the use of molecular dynamics (MD) simulations. ResultsDocking experiments were undertaken using the program AutoDock for twenty-five HIV-1 protease-inhibitor complexes determined by x-ray crystallography. Protein-rigid docking without any dynamics produced a low correlation of 0.38 between the experimental and calculated binding energies. Correlations improved significantly for all time scales of MD simulations of the receptor-ligand complex. The highest correlation coefficient of 0.87 between the experimental and calculated energies was obtained after 0.1 picoseconds of dynamics simulation. ConclusionOur results indicate that relaxation of protein complexes by MD simulation is useful and necessary to obtain binding energies that are representative of the experimentally determined values. BackgroundThe human immunodeficiency virus type 1 aspartic protease (HIV-1 PR) is an important enzyme due to its key role in viral maturation. Inactivation of the enzyme causes the production of immature, noninfectious viral particles. The enzyme therefore is an attractive target in anti-AIDS drug design, and the effect of binding various inhibitors on the protease structure is currently the focus of intensive research [1-3]. To obtain information about the position and energy of binding between an inhibitor and the corresponding protein, several automated docking programs have been developed [4-6]. Given recent improvements in search algorithms and energy functions, computational docking methods have become a valuable tool to probe the interaction between an enzyme and its inhibitors. These methods can contribute significantly to the understanding of structural and energetic basis of enzyme-substrate interactions [7-9]. Protein-ligand docking methods aim to predict the binding energy of the protein-ligand complex given the atomic coordinates. In such calculations, both the protein and ligand can be treated as rigid bodies [10,11]; alternately, the ligand, the protein, or both molecules, can be completely or partially flexible [12,13]. One advantage of incorporating flexibility is that it enables a search without bias introduced by the initial model. This bias normally influences both the orientation and conformation of the ligand in the active site, which usually represents a local minimum conformation [14]. More importantly, the lock and key concepts used to evaluate enzyme-substrate binding, in reality, refer to flexible locks and keys that are both in constant dynamic (thermal) motion [15]. For example, an analysis of conformational changes upon complex formation for a representative set of 39 pairs of ligand-free and ligand-bound structures showed that 50% of these proteins undergo substantial main-chain and side chain conformational changes when the ligand is bound [15]. Several techniques have been developed to predict the binding energies of HIV-1 protease-inhibitor complexes [16-21]. Head RD et al., 1996 [16] used an approach that calculates physicochemical properties of the ligands and the receptor-ligand complexes to estimate the free energy of binding. The enthalpy of binding is calculated by molecular mechanics, while properties such as complementary hydrophobic surface area are used to estimate the entropy of binding through heuristics. Gohlke H et al., 2000 [17] developed DrugScore, a knowledge-based scoring function, to discriminate between well-docked ligand binding modes and those largely deviating from the native structure. Schapira M et al., 1999 [20] used the finite difference Poisson-Boltzmann implementation of the electrostatic term, in conjunction with appropriate surface and entropy terms to predict the binding energy of 13 HIV-1 protease-inhibitor complexes. The predicted binding energies had a correlation coefficient of 0.66 with the experimental data. Recently, Österberg F et al., 2002 used AutoDock 3.0, a ligand flexible docking program, together with combining 21 protease structures into a single representative grid of interaction energies for incorporating the side chain motion into a docking simulation protocol [21]. The correlation coefficient between experimental and calculated binding energies produced by this technique was 0.79. Since most current docking programs treat the receptor and/or ligand as rigid objects, this represents an important cause of failure to predict the correct binding enzyme-substrate energies [18]. We address this problem by performing Molecular Dynamic (MD) simulations on HIV-1 protease-inhibitor complexed and using the resulting structures to calculate the binding energies by AutoDock, a ligand flexible docking program. Results and discussionCorrelation between experimentally determined and calculated binding energyThe primary objective of this study was to determine whether protein relaxation would improve prediction of protein-ligand binding energies. Table 1 shows that the correlation coefficient of the experimentally determined and calculated binding energies from AutoDock for the twenty-five protease-inhibitor complexes after protein-rigid docking was 0.38. The correlations after 0.01, 0.1, 1, and 10 picoseconds (ps) of MD simulations were 0.53, 0.87, 0.79, and 0.74 respectively. These correlations are plotted in Figure 1. One outlier (from 1hvi) was noticed in Figure 1B; after it is eliminated, the correlation coefficient changes from 0.53 to 0.72 after 0.01 ps of simulation.
Table 1. Comparison of experimentally-determined and calculated binding energies for twenty-five HIV-1 protease-inhibitor complexes. All these correlations represent a significant improvement over the protein-rigid docking results. The best results were obtained from the structures at the 0.1 ps MD simulation time point. These structures also have low (≅ 0.3 Å) average all-atom root mean square deviations (RMSD) relative to the experimental results. Table 2 shows the all-atom RMSD between each simulated complex and the corresponding protease-inhibitor x-ray structure. The average all-atom RMSD for the complexes increases from 0.18 Å at 0.01 ps to 2.92 Å at the end of 10 ps of simulation time. Similar results are consistently observed regardless of the five starting seeds used in the MD simulations. Table 2. All-atom root mean square deviation (RMSD) of the protein-ligand complexes relative to their corresponding x-ray structures. When a constant value of 6.5 is subtracted from the predicted energies after 0.1 ps MD, the binding energies of almost all of the predictions were within 2.0 kcal/mol of the experimental values. Three complexes, 1hvi, 1hvr, and 1hte had poor predicted energies, with an average error of 4.34, 4.31, and 5.09 kcal/mol, respectively. Among these, one (1hvi) had better predicted energies when compared to the results of protein-rigid docking. Influence of the protease flap movement on calculated binding energyThe beta-strand flap is the most flexible region in the HIV-1 protease. It is normally 7 Å RMSD from the active site and is in an open conformation in the native state [22,23]. The protease undergoes significant structural changes on binding to an inhibitor. The two flaps fold over the inhibitor to form a tunnel-shaped active site and are held in this close position by hydrogen bonding from Ile50 and Ile50' NH groups of the enzyme to a water molecule, which in turn is hydrogen bonded to the P2 and P1' CO groups of the inhibitor [24]. The bonding stabilizes the flaps in a closed position and inhibits the activities of the enzyme. MD simulation has been used to study the movement of the flap region of HIV-1 protease with a ligand [25-30]. The flaps initially opened to an all-atom RMSD of 25 Å within 200 ps and became completely open at the end of a 10 ns simulation. In this study (Figure 2), the flaps opened up and moved away from the x-ray structure from 0.54 Å at 0.1 ps to 3.30 Å RMSD at 10 ps (the flap RMSD was calculated from residue 40 to 60 of each protein chain). These movements, after 0.1 ps of simulation, are inversely correlated with the quality of the binding energy prediction. As shown in Table 1, the correlation coefficient significantly decreased from 0.87 at 0.1 ps to 0.74 at 10 ps as the all-atom flap RMSD increased from 0.54 to 3.30 at 0.1 and 10 ps, respectively.
Complementarity between the ligand and the binding site is the basic concept behind ligand binding. This is manifest as steric complementarity, i.e. the shape of the ligand is mirrored in the shape of the binding site, allowing molecular interactions between two molecules [31]. MD simulations allow rearrangement of the protease side chain, especially on the active site surface, which improves the interacting surface complementarities of the complex. As shown in Table 2, after 0.1 ps, the time scale that produced the best correlation coefficient, the average all-atom RMSD of the complex was only 0.35 Å, while the average all-atom RMSD of the flap region was 0.54 Å. Influence of MD simulation duration on ligand bindingLigand docking revealed a consistent set of recurring binding modes. For all MD time scales, well-clustered docking results could be obtained. Generally, the lowest binding energy clusters are associated with the lowest all-atom RMSDs of the ligands. The best results in terms of lowest binding energy are located in a similar position of the x-ray structure at the active site. Table 3 shows the number of docking solutions in a cluster (N) along with the all-atom ligand RMSD for each MD time scale. Table 3. Ligand all-atom RMSD (Å) and the number of docking solutions (N) in the cluster from 100 Larmarckian genetic algorithm (LGA) docking runs of twenty-five protease-inhibitor complexes. A small N value indicates strong specificity of binding, with all of the solutions resembling one of only a small number of binding conformations and orientations. On the other hand, if N is large, the experiment indicates a low specificity of binding, since the solutions are composed of many different binding conformations or orientations. In this study, the clustering result of ligands had an average N of 10.72 for docking without any MD, 8.36, 4.12, 5.88, and 7.16 for docking with 0.01, 0.1, 1, and 10 ps MD simulation, respectively. On average, the lower number of docking solutions in the cluster for all MD simulation time scales indicates that the ligands bind to their binding pocket with high specificity. The docked ligands after 10 ps MD simulation exhibited a wide range of RMSDs (0.87 Å to 3.86 Å) with an average RMSD of 2.52 Å, which indicates the failure of the ligands to recognize and specifically bind to the binding site with the protease flaps open. ConclusionsIn this study, we illustrate the importance of taking dynamics into account to predict the structure and energetics of protein-ligand interactions. It is clear that relaxation of HIV-1 protease for 0.1 ps MD simulations is enough for rearrangement of the surface-binding pocket to produce good correlations between calculated and experimental binding energies. The binding energies of all protease molecules bound to different inhibitors were influenced by the movement of the flap regions with the correlation coefficient decreasing as the flaps moved away from the experimental structure. The differences in these correlations may reflect biological features of the dynamics of HIV-1 protease-inhibitor complexes. Future work with larger data sets, different energy functions, different docking and binding energy evaluation methods, and more starting seeds, is necessary to determine the optimal parameters to robustly predict protein-substrate binding energies in silico. MethodsA set of twenty-five HIV-1 protease-inhibitor complexes with experimentally determined structures and binding energies chosen from the Protein Data Bank (PDB), were used in this study [32-50]. All complexes were determined by x-ray crystallography with resolution and R-factor less than 3.0 and 0.19, respectively. The PDB codes of the complexes along with the experimental binding energies are listed in Table 1. The experimental binding energies used in this study were converted from the experimental inhibition constants (Ki), using the equation: ΔG0 = -RT ln Ki where ΔG0 is the Gibbs free energy of binding (kcal/mol), R is the gas constant (1.987 cal K-1 mol-1) and T is the absolute temperature (room temperature, 300 K). Assuming that the Ki values of each complex were obtained under the quasi-equilibrium conditions of Michaelis-Menton kinetics [51,52], we expect that a plot of ΔG0 against ln Ki is a straight line, with a slope given by the thermal energy, RT = 0.6 kcal/mol (at 300 K). Since the Ki values were determined by different experimental techniques, some level of noise might exist in the underlying data. We expect our method to behave robustly in this regard. Molecular Dynamics simulationsMD simulations of all complexes were carried out with the NAMD software version 2.5b1 [53] using the X-PLOR force field [54]. Missing atoms in 1heg, 1hte and 5hvp were added by using the guesscoord command in NAMD. The van der Waals, bond, angle, dihedral, and improper dihedral parameters for all the ligands were adopted from the Hetero-compound Information Centre-Uppsala HIC-Up http://xray.bmc.uu.se/hicup webcite[55]. The water molecule under the flaps present in all complexes was included in all steps throughout this study except for 1hvr, where the structural water was removed in the preparation of the protease to be docked with the ligand (the ligand of 1hvr, which is a urea-based inhibitor, does not bind a water molecule in this position). The water molecules were added to the binding site of 1hvs because all the water molecules were absent in the experimental structure. Protein protonation states were modeled as in the related HIV-1 protease MD simulation study [56]. All protein residues were modeled in their charged state except for one of the two aspartic acid groups (Asp 25 and Asp 25') in the active site since previous studies [57-59] have shown that at least one of these two aspartic acids is protonated. We used a protonated Asp 25' and deprotonated Asp 25 for all HIV-1 protease-ligand complexes. The terminal residues of both monomers were also protonated (Pro 1, Pro 1', Phe 99 and Phe 99'). The complexes were immersed in a 20 Å radius sphere of TIP3-water using the program SOLVATE [60] to allow the protein-ligand complexes to relax in an aqueous environment. One hundred steps of energy minimization of the protein-ligand-water complex were initially performed, followed by 10 ps MD simulation at 300 K, with an atom-based shifted distance-dependent dielectric constant, ε = 4r; a switch function on van der Waals interaction, and a time step of 1 femtosecond (fs). The nonbonded interaction list was updated every 20 time steps. The van der Waals interactions were truncated at a distance of 12 Å. The simulations were repeated with five different starting seeds. The structures at 0.01 ps, 0.1 ps, 1 ps and 10 ps were recorded and processed in the docking step. Preparation of protease-inhibitor complexesTo calculate the binding energy with AutoDock, we first prepared the protein complexes by separating each snapshot recorded from MD simulations into one file containing the protease and the water molecules, and one file containing only the ligand. Polar hydrogens were then added to the protein coordinates with the PROTONATE utility from AMBER [61]. AMBER united atom force field charges were assigned, and solvation parameters were added using the ADDSOL utility. The 3D affinity grid fields were created using the auxiliary program AutoGrid. The center of protein mass was chosen as the grid center. In this stage, the protein was embedded in the 3D grid and a probe atom was placed at each grid point. The affinity and electrostatic potential grid was calculated for each type of atom in the protease molecule. The number of grid points in x, y, z-axis was 60 × 110 × 60 with grid points separated by 0.375 Å. Ligands that had a peptide-like N- or C-terminal end were assigned a charge. Hydrogen atoms were added to fill all empty valences, and Kollman united-atom charges [62] were also assigned to the ligands. Rotatable dihedrals in the ligands were assigned using the program AutoTors and were allowed to rotate freely. The nonpolar hydrogens were removed and the partial charges from these were added to the carbon that held the hydrogen. The atom type for the aromatic carbons was reassigned to be handled by the aromatic carbon grid map. These preparations were done for each ligand using the AutoTors module. Automated dockingDocking calculations were carried out using AutoDock, version 3.0.5 [4]. Three binding energy terms were taken into account in the docking step: the van der Waals interaction represented as a Lennard-Jones 12–6 dispersion/repulsion term, the hydrogen bonding represented as a directional 12–10 term, and the Coulombic electrostatic potential. Docking runs were performed using the Larmarckian genetic algorithm (LGA) [4] as previously described [63,64] with some modifications of the docking parameters. The LGA describes the relationship between the protein and the ligand by the translation, orientation, and conformation of the ligand. These "state variables" are the ligand's genotype, and the resulting atomic coordinates together with the interaction and intermolecular energies are the ligand's phenotype. The environmental adaptation of the ligand's phenotype was reverse transcribed into its genotype and became heritable traits. Docking began with a population of random ligand conformations in random orientations and at random translations. Each docking experiment was derived from 100 different runs that was set to terminate after a maximum of 1,500,000 energy evaluations or 27,000 generations, yielding 100 docked conformations. The population size was set to 50. The elitism number, the rate of gene mutation and the rate of gene crossover were 1, 0.02 and 0.8 respectively. A pseudo-Solis and Wets local search was then used to minimize energy of the population. The probability that docking solution in the population would undergo a local search was set to 0.06 and the constraint was set to a maximum of 300 iterations per search. The maximum number of successes or failures before changing the size of local search space (rho) were both set to 4. The starting conformations of the ligand were set to random positions. Translations were set to have a maximum limit of 2 Å/step and the orientation step size for the angular component and the torsions had a maximum limit at 50 degrees/step. At the end of a docking job with multiple runs, AutoDock performed cluster analysis. Docking solutions with ligand all-atom RMSDs within 1.0 Å of each other were clustered together and ranked by the lowest energy representative. The lowest-energy solution of the lowest ligand all-atom RMSD cluster was accepted as the calculated binding energy. Author's contributionsEJ performed Molecular Dynamics simulations, docking, evaluated the results, and drafted the manuscript. RS helped with evaluation of the results produced, refining the manuscript, and provided intellectual guidance and mentorship. All authors read and approved the final manuscript. AcknowledgmentsThis work was supported in part by a Searle Scholar Award to Ram Samudrala. We thank members of the Samudrala group for helpful comments. References
Have something to say? Post a comment on this article! |



on Google Scholar







author email
corresponding author email
Figure 1.
Figure 2.