Abstract
Background
Microbial hosts offer a number of unique advantages when used as production systems for both native and heterologous smallmolecules. These advantages include high selectivity and benign environmental impact; however, a principal drawback is low yield and/or productivity, which limits economic viability. Therefore a major challenge in developing a microbial production system is to maximize formation of a specific product while sustaining cell growth. Tools to rationally reconfigure microbial metabolism for these potentially conflicting objectives remain limited. Exhaustively exploring combinations of genetic modifications is both experimentally and computationally inefficient, and can become intractable when multiple gene deletions or insertions need to be considered. Alternatively, the search for desirable gene modifications may be solved heuristically as an evolutionary optimization problem. In this study, we combine a genetic algorithm and elementary mode analysis to develop an optimization framework for evolving metabolic networks with energetically favorable pathways for production of both biomass and a compound of interest.
Results
Utilization of thermodynamicallyweighted elementary modes for flux reconstruction of E. coli central metabolism revealed two clusters of EMs with respect to their ΔG_{p}°. For proof of principle testing, the algorithm was applied to ethanol and lycopene production in E. coli. The algorithm was used to optimize product formation, biomass formation, and product and biomass formation simultaneously. Predicted knockouts often matched those that have previously been implemented experimentally for improved product formation. The performance of a multiobjective genetic algorithm showed that it is better to couple the two objectives in a single objective genetic algorithm.
Conclusion
A computationally tractable framework is presented for the redesign of metabolic networks for maximal product formation combining elementary mode analysis (a form of convex analysis), pathway thermodynamics, and a genetic algorithm to optimize the production of two industriallyrelevant products, ethanol and lycopene, from E. coli. The designed algorithm can be applied to any smallscale model of cellular metabolism theoretically utilizing any substrate and applied towards the production of any product.
Background
Microorganisms are increasingly utilized to synthesize a variety of products [13], including fuels (bioalcohols [413] and biodiesels [14,15]), specialty chemicals (amino acids [1620]), therapeutic smallmolecules [2125] (antibacterials, anticancer agents, and cholesterollowering agents), and biopharmaceuticals [26] (proteins, vaccines, and virus particles). A common challenge in developing highyield cellular production systems is that organisms have evolved to optimize growth rather than the formation of a particular endproduct. In principle, this challenge could be met by reprogramming the cellular objective using genetic modifications (such gene insertions, overexpressions, or deletions). In practice, the selection of appropriate gene modification targets can be a daunting task. Biomass formation as well as product synthesis requires building block precursors and cofactors provided through the concerted actions of a large number of interconnected metabolic pathways encoded by hundreds to thousands of genes. While purely empirical attempts at genetic modifications have in some cases led to impressive success [27], these cases have provided the exceptions rather than the rule. There is now considerable evidence that substantial improvements in productivity require manipulating the activities of multiple enzymes in different parts of cellular metabolism [28]. In this respect, optimizing biosynthetic productivity will almost certainly benefit from computational modeling tools that systematically and efficiently explore the consequences of gene or enzymelevel modifications across the breadth of cellular metabolism.
Currently, there exists a variety of methods for studying metabolic networks in both quantitative and qualitative manners: flux balance analysis (FBA) [2931], ^{13}Clabeling based metabolic flux analysis (^{13}CMFA) [32], metabolic control analysis [33], elementary mode analysis (EMA) [34], extreme pathway analysis [35], cybernetic modeling [36,37], and biochemical systems theory [3840]. Many of these methods do not necessarily identify experimentally tractable metabolic engineering targets such as gene deletions. Whereas, some algorithms based on the aforementioned methods can be used to identify such targets including minimization of metabolic adjustment (MoMA) [41], regulatory on/off minimization (ROOM) [42], OptKnock [43], OptStrain [44], OptReg [45], and OptGene [46]. All six of these methods require solving an optimization problem to determine flux distributions as a means of evaluating the strain's (or mutant strain's) metabolic capabilities. Although these optimization approaches can accurately predict optimal growth and production fluxes in some cases [47], other experimental settings produce inaccurate predictions [48]. In addition, situations that require the removal of numerous genes to achieve high productivity will lead to mutant strains significantly different from wildtype systems, further weakening the assumptions behind FBA. OptKnock and OptStrain utilize a bilevel optimization for determining superior mutant strains. The mixed integer linear programming (MILP) framework used in these two algorithms optimizes for one objective within another competing one (a cellular objective (biomass production) within an engineering objective (chemical production)). However, the user must provide the number of knockouts that OptKnock and OptStrain can allow. In general, exhaustively searching genomic space for knockout candidates is computationally intractable even on smallscale metabolic models (less than 100 reactions), much less on current genomescale metabolic models (greater than 1000 reactions) due to prohibitive computation time. This situation coupled to the fact that two or three knockouts are likely not sufficient for generating a mutant capable of maximal productivity motivated the use of a genetic algorithm as demonstrated in OptGene.
Genetic algorithms (GAs) have been classically utilized as a search method for optimization of objective functions that are discontinuous, nondifferentiable, stochastic, or highly nonlinear. As the name implies, the underlying theory behind GAs is based on Darwinian evolution. GAs seek to evolve a population of potential solutions by crossover and mutation, and through multiple generations, the entire population will eventually "evolve" towards a global optimum (a "fitness score"). They have been used to some extent in modeling biological systems [46] and are used as a search technique in this study. Though OptGene utilizes a GA to efficiently explore genotypic space, the framework still requires the use of a metabolic assumption required to determine metabolic fluxes (such as FBA, MoMA, or ROOM). The goal of this study was to leverage the power of a GA without the need for a metabolic assumption. As such, a primary objective of this work was to identify a genotype with a high productivity phenotype strictly from the wildtype organism's metabolic network topology, utilizing thermodynamics.
An elementary mode (EM) is a nondecomposable set of reactions (encoded by a set of genes) that leads to a functional metabolic pathway. EMA is a method of enumeration of all of the EMs of a metabolic network. As such, an EMA presents a convex analysis problem from computational geometry, in which the extreme rays of the polyhedral cone (as defined by stoichiometry and reversibility) are the EMs of the metabolic network. As a result, an EM represents a single functional pathway within overall cellular metabolism, a linear combination of a cell's EMs can be used to describe any metabolic state achievable by the cell's stoichiometry. The algorithmic complexity of this problem has not been studied in detail and has therefore been classified as at least an NPhard (nondeterministic polynomialtime hard) problem [35]. Empirical observations have shown that the computation time of EMA algorithms grows approximately quadratically with respect to the number of EMs, and unfortunately, the number of EMs grows exponentially with respect to network size. As a result, the computation time increases greatly with respect to network size, limiting analysis to non genomescale metabolic networks. Nonetheless, EMA has been utilized to design strains of E. coli that are efficient at producing biomass from glucose [49] and ethanol from five and sixcarbon sugars [50]. In two cuttingedge applications, EMA was combined with linear programming to determine flux distributions from external measurements in lysineproducing Corynebacterium glutamicum [51], and to determine the metabolic fluxes of Lactobacillus rhamnosus growing on medium containing mixed substrates [52]. EMA has also been utilized to determine flux distributions in polyhydroxybutyrateproducing E. coli, mediated by a thermodynamic analysis of the EMs [53].
In this study, an algorithm based on EMA, pathway thermodynamics, and a GA has been constructed with the goal of redesigning a metabolic network towards maximally producing compounds of interest while simultaneously sustaining high biomass formation. This algorithm was applied to producing ethanol and heterologous lycopene through E. coli. We addressed issues of computation speed by coupling the EMA model with a GA to efficiently explore genotypic space. This algorithm presents a combination of a variety of traits that have been explored previously by themselves: 1) it is based solely on reaction stoichiometry and network topology, independent of any experimental flux measurements (although, if flux measurements are available, they can be used to constrain the problem); 2) by the utilization of a GA, it contains an efficient search method arriving at a solution within minutes on a singleprocessor notebook system; 3) by the utilization of a GA, it arrives at an optimal solution in a computationally tractable amount of time; and 4) it contains the option to use a multiobjective genetic algorithm (MOGA), only utilized very recently in analysis of metabolic networks, but still within the context of FBA [54].
Results & Discussion
Algorithm
The framework is schematically shown in Figure 1. A GA is used for the optimization, by evolving a population of potential solutions (strains) towards the global optimum solution (the theoretical yield of either product or biomass on substrate). A strain is represented by a binary vector, or a chromosome, where an entry of "1" indicates a particular reaction corresponding to the position of the entry included in the strain and a "0" indicates that the reaction is not present. Each strain is then evaluated based on a fitness criterion. The strain's EMs are then enumerated and its corresponding metabolic flux vector is generated by taking a weighted linear combination of the EMs in two ways: 1) equal weighting of the EMs, or 2) weighting the EMs based on their corresponding Gibbs free energy associated with the pathway (ΔG_{p}°). Next, the yields of biomass and product on substrate are calculated and the strain's corresponding fitness is evaluated. This fitness, a reflection of the cellular phenotype, is then used by the GA to enrich the population of potential solutions to those that have a higher fitness value. This process is repeated until one of the GA's stopping criteria is met.
Figure 1. Schematic overview of the framework. The following optimization objectives (fitness functions) were considered: (a) calculating metabolic fluxes through equalweighting of the EMs, (b) calculating metabolic fluxes through weighting the EMs based on their corresponding Gibbs free energy cost, and (c) minimizing the average Gibbs free energy cost of EMs forming both product and biomass. The free energy cost of a reaction route was calculated as the stoichiometric sum of its reaction Gibbs energies. Reaction routes with net negative energy costs are considered favorable.
Elementary Mode Analysis
As test cases, two smallscale stoichiometric models were constructed of ethanol or lycopeneproducing Escherichia coli using a previously published model as a template [50], the details of which can be seen in Table 1. The dimensions of the ethanol and lycopene models were 47 × 60 (47 metabolites and 60 reactions) and 50 × 64 (50 metabolites and 64 reactions), respectively. The ethanol model supported 33,220 EMs: 8,389 (25.3%) of which produce ethanol, 28,336 (85.3%) of which produce biomass, and 7,156 (21.5%) of which produce both ethanol and biomass. The lycopene model supported 42,659 EMs: 9,439 (22.1%) of which produce lycopene, 32,763 (76.8%) of which produce biomass, and 4,427 (10.4%) of which produce both lycopene and biomass. A scatter plot (Figure 2) of each EM's product vs. biomass yield showed a negative correlation for both models, suggesting a pattern of tradeoff between cell growth and product formation. This tradeoff was most evident for EMs with either very high biomass or product yields. The theoretical yield of ethanol on glucose is 0.511 g g^{1 }and the theoretical yield of lycopene on glucose is 0.316 g g^{1}. There are 14 (0.04%) EMs that produce the theoretical yield of ethanol on glucose (requiring between 39 and 44 reaction removals), while there is only one EM (0.002%) that produces the theoretical yield of lycopene on glucose (requiring 36 reaction removals). EMA was conducted on a notebook equipped with an Intel^{® }Core™ 2 Duo T9300 CPU running at 2.50 GHz, 4.0 GB memory, and a 32bit version of Microsoft Windows Vista™ Ultimate. The computation time required to enumerate the entire set of EMs was 8.04 s ± 0.27 s (n = 10) and 12.30 s ± 0.24 s (n = 10), for the ethanol and lycopene models, respectively. More detailed information of both models can be found in Additional File 1 and Additional File 2.
Additional file 1. Ethanol model information. A table describing the ethanol model containing reaction number, abbreviation, stoichiometry, as well as corresponding gene names and enzymes.
Format: DOC Size: 81KB Download file
This file can be viewed with: Microsoft Word Viewer
Additional file 2. Lycopene model information. A table describing the lycopene model containing reaction number, abbreviation, stoichiometry, as well as corresponding gene names and enzymes.
Format: DOC Size: 84KB Download file
This file can be viewed with: Microsoft Word Viewer
Figure 2. Product vs. biomass yield for every EM in the (a) ethanol and (b) lycopene model. The number of EMs was 33,220 and 42,659 for the ethanol and lycopene models, respectively (more information on these models can be found in Table 1). Yields are expressed as fractions of the theoretical yield of product or biomass. The theoretical yields of product or biomass on substrate were determined by searching the EMs of the templategenome (the parent strain) and finding the maximal yields for product or biomass on substrate, respectively. Each point in the scatter plot corresponds to an EM (or number of EMs) with a pair of product and biomass yield values. Several EMs were associated with the same pair of product and biomass yield values, which are represented by a single point.
Table 1. Information on the ethanol and lycopene models, their corresponding EMs, and their reaction and pathway change in Gibbs free energy.
Pathway Gibbs Free Energies
Standard Gibbs energies of formation (ΔG_{f}°) of metabolites included in the two models were calculated using group contribution theory [55] from previously reported data [56]. These values were used to estimate the Gibbs energy changes of reactions in the model. Technically, these estimates correspond to Gibbs energy changes defined for standard conditions (ΔG_{r}°) (298.15 K, 1 atm, pH 7.0, all compounds at 1 M), rather than physiological conditions. Consequently, it is quite likely these estimates deviate slightly from experimentally determined values. In this study, we used these estimates as firstorder approximations to derive the Gibbs energy changes across metabolic pathways (ΔG_{p}°) as defined by the EMs. Histograms of ΔG_{r}° and ΔG_{p}° values (Figure 3) show qualitatively different distributions. The ΔG_{r}° histogram approximates a normal distribution about zero; whereas, the ΔG_{p}° histogram clearly skews in the negative direction. The mean ΔG_{r}° values for both models are slightly negative (Δ1.95 kcal mol^{1 }and 1.77 kcal mol^{1 }for the ethanol and lycopene models, respectively). Interestingly, the skewness of both distributions is different: the ethanol model has a negative skewness of 4.54 kcal mol^{1 }while the lycopene model has a slight positive skewness of 0.42 kcal mol^{1}. This indicates a change in distribution to slightly more energetically favorable reactions with respect to the ethanol model. The mean values for ΔG_{p}° are much more negative at 111.87 kcal mol^{1 }and 88.72 kcal mol^{1 }for the ethanol and lycopene models, respectively. Interestingly, the ethanol model contains relatively few (242, only 0.73%) thermodynamically infeasible EMs (having ΔG_{p}° > 0), while the lycopene model contains both a higher number of thermodynamically infeasible EMs in both number (1487) and percentage (3.49%). This exemplifies the nature of a host engineered to produce a compound it does not normally produce, showing that a variety of the pathways are not evolved.
Figure 3. Histogram of reaction Gibbs free energies for the (a) ethanol and (b) lycopene models. Also plotted is a histogram of the EM Gibbs free energies for the (c) ethanol and (d) lycopene models. Gibbs free energies for the EMs were computed as described in the text.
Previous analyses have shown that most metabolic reactions are near equilibrium (ΔG_{r}≈ 0) [57]; whereas, metabolic pathways are energetically favorable (ΔG_{p }< 0) [56], consistent with the trends shown in the panels of Figure 3. In this regard, the calculated ΔG_{p}° values should offer qualitatively correct and quantitatively reasonable estimates of the thermodynamic favorability of metabolic pathways. The change in Gibbs free energy across an entire pathway (an EM) is much more likely to be negative than the change in Gibbs free energy across individual reactions within the pathway as shown in Figure 3c and Figure 3d. Therefore, there is a strong correlation between stoichiometric feasibility and energetic favorability at the level of the pathway. Moreover, thermodynamic favorability has already been used in the past to narrow the solution space or eliminate infeasible solutions when estimating or optimizing metabolic flux distributions [57,58]. In the algorithm, we used the ΔG_{p}° estimates to identify thermodynamically favorable reaction routes and enrich the mutant organism with these favored routes towards product and biomass production.
As stated previously, cellular metabolism (a flux vector) can be represented as a linear combination of the cell's EMs. It has been demonstrated previously that there exists a strong correlation between the standard change in entropy across an EM (ΔS_{p}°) and the weighting factor of its contribution to the overall flux state in both a wildtype E. coli strain and a strain engineered for polyhydroxybutyrate production [53]. The resulting method of determining fluxes based on weighting by ΔS_{p}° values was then compared to flux values reported in literature and showed a strong correlation (R^{2 }= 0.85). In an analogous manner, we utilized Gibbs free energies for flux determination similar to previous efforts [57].
Comparison of Calculated Fluxes with Alternative Methods
The flux profiles of the E. coli model (without the additional lycopene biosynthetic reactions) using both equalweighting of the EMs and thermodynamic weighting of the EMs were compared to fluxes calculated by FBA (Figure 4). In the linear programming approach of FBA, fluxes are determined through the utilization of an objective function which maximizes the biomass equation ("growthrate"). This is an inherently different approach in which optimization is utilized to determine the flux distribution; whereas, the approach in this algorithm only uses optimization for redesigning the cellular genotype. As such, the algorithm's fluxdetermination method makes no metabolic assumptions; instead, it assumes that the flux distributions are determined by wellgrounded thermodynamic principles, which have been shown to be accurate in previous studies [53].
Figure 4. A bar plot of fluxes for the ethanol model calculated using (a) FBA linear optimization, (b) equallyweighted EMs, and (c) thermodynamicallyweighted EMs.
Ethanol Case Study
Two case studies were performed to evaluate our algorithmic framework. In these case studies, the algorithm was tasked to identify gene knockouts that would result in the optimal product yield, biomass yield, or overall productivity (biomass yield × product yield). The two cases were: aerobic production of ethanol (a native compound) and aerobic production of lycopene (a heterologous compound), both in E. coli. The search space for the algorithm excluded reactions that computationally led to either no biomass formation or no product formation. For the ethanol case study, these reactions were BIO, FEM5, FEM6, GG1, PPP5r, TCA1, TCA2r, TCA3r, TCA4, TRA1, and TRA3 (see Additional File 1 for more information). These reactions were identified by conducting a single reaction removal analysis on the wildtype model. The reactions were removed individually, the EMs were enumerated, and the EMs for each mutant were rank ordered based on their stoichiometric ethanol yield or biomass yield. If the maximal yield for either ethanol or biomass was zero, then these reactions were considered to be necessary. As a result, any strain that contained any one of these knockouts would produce either no ethanol or no biomass.
In this case study, equal weighting of the EMs and weighting the EMs based on their corresponding Gibbs free energies led to fitness values of 1.000 for ethanol production (the fitness function described in Eq. 5) requiring 13 and 11 reaction removals, respectively. This indicates that ethanol production at the theoretical yield is thermodynamically feasible. The fitness values for the coupled ethanol and biomass production fitness function were similar, though not identical: 0.266 (equal weighting) and 0.241 (thermodynamic weighting). The individual yields of ethanol and biomass were slightly different, with the ethanol fractional yield being higher in both cases. This highlights a potential advantage of utilizing a MOGA: the array of solutions will display a range of organism phenotypes that include: 1) highproducing, slowgrowing, 2) lowproducing, fastgrowing, and 3) moderateproducing, moderategrowing. In most of the cases investigated here, a coupled fitness function (Eq. 7) leads to organisms generated by the third case. However, the potential in using a MOGA is the ability to choose what type of organism the researcher would like to construct based on ease of construction or process economics. As can be seen in Figure 5, the algorithm as formulated here is quite robust at finding the global optimum quickly, even given varying initial conditions.
Figure 5. A plot of average and mean fitness scores (± one standard deviation, n = 10) as a function of generation number for optimizing ethanol production using equallyweighted EMs.
Whenever ethanol was being optimized (either by itself or with biomass), reactions for either NADH dehydrogenase I/ATP synthase (OPM1) or NADH dehydrogenase II (OPM4r; also known as NADH:ubiquinone oxidoreductase II) were removed. Both NADH dehydrogenase I and II are involved in driving electron flow, while NADH dehydrogenase I is driven by oxygen. NADH dehydrogenase II uses NADH exclusively and is repressed when E. coli is grown anaerobically [59]. The results predicted using either EMweighting method are consistent with what is known about ethanol production through E. coli, namely, that it is mainly produced anaerobically. Also for the majority of the cases, reactions for the pyruvate oxidase (coded by poxB) and phosphate acetyltransferase (coded by pta) were identified for removal, consistent with what has been previously reported for improving ethanol production from glucose through E. coli [49]. Depending on the weighting scheme, reactions identified for removal were fumarate reductase (coded by frdABCD), malate dehydrogenase (coded by sfcA and maeB), or lactate dehydrogenase (ldhA), all also identified by the pervious study as a near optimal producing genotype.
All of the solutions generated strains with between 16 and 24 EMs (over a three order of magnitude reduction in EMs); the different fitness functions produced mutant genomes that shared a number of similarities. Across many of the solutions, reactions in oxidative phosphorylation (as previously described) were removed. Many of the fermentative acid pathways were also removed (most notably for acetate and lactate production), which would limit the formation of undesired byproducts. Relative few reactions in glycolysis, the TCA cycle, and the PPP were removed in the different weighting schemes. It is also interesting to note that the main fermentative pathway preferred by E. coli (the acetic acid pathway) and the lactic acid pathway are chosen for removal over other secreted weak acid metabolites (formate, succinate, and pyruvate). It is therefore likely that the removal of other fermentative pathways would improve flux to the ethanologenic pathway. Taken together, the general results across many of the fitness functions suggest removal of certain fermentative and anapleurotic pathways improves cellular phenotype.
As stated, many of the knockouts identified by the algorithm presented here have been reported in E. coli. The frdA and ndh knockouts were also identified by an EMAbased algorithm and implemented in the laboratory to improve ethanol production [50]. The nuo and atp operons were neglected in the previous model because the model was restricted to anaerobic action, therefore as stated previously, the algorithm here correctly identifies these as knockout targets (akin to operating anaerobically). However, the nuo knockout increases glucose uptake and ethanol production, while decreasing acetate, succinate, lactate, and formate formation in an anaerobic chemostat with complex medium supplemented with glucose [60]. While the previous algorithm identified a variety of other knockouts predicted to improve production, the algorithm design was slightly different and exemplifies one of the challenges in both engineering and modeling biological systems. As has been shown here and in other places [61,62], multiple genotypic states can lead to the same phenotypic state.
Lycopene Case Study
Next, the E. coli metabolic network was optimized for lycopene production. Lycopene is a C_{40 }carotenoid natural product with antioxidant properties. Much work has been devoted to engineering lycopene biosynthesis in E. coli [6372], due to the fact that it shares metabolic precursors (DMAPP and IPP) to other isoprenoid natural products with immense therapeutic value, such as the antimalarial sesquiterpene artemisinin and the anticancer diterpenoid paclitaxel [70].
Using the same approach as in the ethanol case study, the results were slightly different here. In general, the solutions generated strains with more EMs (between 18 and 174), perhaps due to the fact that the number of EMs in the parent network was nearly 30% larger for this model. First, the thermodynamicallyweighted cases often converged to lower values than the equalweighted cases when lycopene yield was included in the fitness function. Second, the fitness value for both the product yield and biomass yield optimization cases were suboptimal (less than one). While this was the case for biomass production in the ethanol model, it was not the case for product formation. This is likely due to the fact that the ethanol model had 14 EMs capable of producing the theoretical yield of product on glucose, while the lycopene model only had 1 EM that produced the theoretical yield. This genotype resulted in 36 reaction removals, which is well above the number of reaction removals found in the optimal solutions presented in Table 2 and Table 3. In both cases, the algorithms were run multiple times to confirm the result, and the same fitness values were achieved. This result may be due to the fact that the original problem was seeded with an initial population of individuals containing between two and six removals, which may bias solutions with fewer knockouts (which is certainly experimentally amenable). This could also be due to the severe drain of the cellular NADPH pool required for lycopene biosynthesis (the pathway for lycopene biosynthesis requires 16 molecules of NADPH consumed for 1 molecule of lycopene produced) [64].
Table 2. Information on resulting strains after running the algorithm for the ethanol model.
Table 3. (Information on resulting strains after running the algorithm for the lycopene model.
A pioneering study on applying computational methods (specifically, MoMA) for driving metabolic engineering studies focused on lycopene production from glucose minimal medium through E. coli [64]. The single knockout target search identified seven knockout targets (gdhA, cyoA, gpmAB/yjtC, ppc, glyA, eno, and aceE), two of which are not included in the model used here as they pertain to amino acid metabolic pathways (gdhA and glyA). Of the other five, all were identified in some capacity by our model when optimizing for lycopene or lycopene and biomass, some of them in multiple cases. Of these computationally identified knockouts, numerous single, double, and tripleknockout strains were constructed in the laboratory and showed improved lycopeneproducing phenotypes.
The fdhF knockout (identified here as a knockout candidate in the equal weighting cases, both with and without biomass production) improved specific lycopene production by 4%. Combining these two knockouts with a gdhA knockout (as identified by genomescale MoMA simulations, but not considered in the model presented here) resulted in the best triple knockout strain, improving specific lycopene production by 37% [64]. The nuo knockout improved specific lycopene production by 45% (from 1,100 ppm to 2,040 ppm) in complex medium supplemented with glucose. The other knockouts identified in this case have either not been reported with respect to improving lycopene production, or are lethal to the cell (as is the case with pgk). The aceE knockout was identified by MoMA simulations and implemented in the laboratory, improving specific lycopene production by 9% in minimal medium supplemented with glucose [64]. The pykAF doubleknockout improved specific lycopene production threefold (from approximately 5 to 15 mg gDCW^{1}) in complex medium [69]. While many of these knockouts have not been conducted in the same strain, there remain many opportunities to improve lycopene titers. Currently, lycopene production yields reported are well below the theoretical yield on glucose (316 mg/g glucose). For example, bioreactor cultivation of the overproducing ΔgdhA ΔaceE ΔfdhF triple knockout strain resulted in a lycopene yield of 2.15 mg g glucose^{1 }[66], less than 1% of the theoretical yield. It is reasonable to assume that numerous additional knockouts would further aid efforts to reach this theoretical yield. Overall, the reported literature on metabolic engineering effort to improve lycopene production in E. coli strongly supports the validity of the algorithm developed here.
Multiobjective Genetic Algorithm
Given the dual objective nature of the system in question (product yield and biomass yield), it would be logical to also assess the performance of a multiobjective genetic algorithm (MOGA). MOGA maximizes/minimizes a vector of objective functions (in this case, a vector of length two) rather than a scalar objective, as was the case for the GA. As a result, there is no single, unique solution to this problem. Instead of identifying a single solution, a MOGA aims to identify a set of solutions in which an improvement in one objective requires a decrease in the other. Each solution is considered to be a noninferior solution and the entire set of noninferior solutions is referred to as the Pareto optima. The MOGA invoked here uses a controlled elitist genetic algorithm, a variant of the Nondominated Sorting Genetic AlgorithmII (NSGAII).
The multiobjective version of the algorithm was tested on the ethanol model, with both equal and thermodynamicweighting of the EMs. In each case, of the entire final population, only a small fraction of the individuals (<10%) had noninferior solutions. In the case for equalweighting of the EMs, the best individuals had fitness values of 0.3475 and 0.4375 (for Eq. 10 and Eq. 11, respectively), which has a product of 0.1520 and is much lower than the value of approximately 0.25 found in the solution of the singleobjective GA with the coupled fitness function (Eq. 7). Similarly, while thermodynamicallyweighting the EMs, the best individual in the Pareto optima had fitness values of 0.6885 and 0.2112 and a product of 0.1454, also much lower than the value of approximately 0.26 found in the solution of the singleobjective GA with the coupled fitness function. As stated previously, an advantage of the MOGA is that it allows the user to "choose" whether to pursue constructing a strain predicted to have slightly lower growth rate but higher product yield versus a strain predicted to have a slightly higher growth rate but lower product yield (as can be seen in the scatter plots in Figure 6). However, in this study, the suboptimal values of the fitness functions and increased computational times place the MOGA approach at a disadvantage when compared to the singleobjective GA.
Figure 6. A plot of the Pareto optima and the fitness values for the corresponding noninferior solutions for the ethanol model using (a) equallyweighted EMs and (b) thermodynamicallyweighted EMs.
Though MOGAs have not been used for optimizing the structure of metabolic networks, there has been a recently reported example of using one for optimizing an industrial bioprocess (penicillin V production from Penicillium chrysogenum) [73]. In particular, a MOGA was used for 1) maximizing penicillin titer and maximizing penicillin yield from substrate, 2) maximizing penicillin titer and minimizing fermentation time, among other decision variables. While all of these were optimizing for two objectives, the authors invoked a triobjective GA yield for simultaneously optimizing penicillin titer, penicillin yield, and profit.
Conclusions
This article presents the development and application of a computationally tractable framework which combines elementary mode analysis, pathway thermodynamics, and a genetic algorithm. The framework was then used to efficiently redesign the E. coli metabolic network for maximal production of two industriallyrelevant products, ethanol and lycopene. Our results show that E. coli metabolism can be retailored quite efficiently for optimal or nearoptimal production of a product of interest (ethanol or lycopene were examples here), biomass, or coupled product and biomass. As discussed, many of the gene knockouts identified by the algorithm to improve production formation have been tested experimentally (however, most often individually and not in combination) and have been shown to improve product formation rates.
It has been shown that the contribution of an individual EM to overall cellular metabolism can be estimated from its pathway thermodynamics [53]. It has been proposed that this is a result of billions of years of evolution underlying the metabolic regulation and expression patterns of the genes within these pathways. As a result of this proposal, it can be assumed that a cell will attempt to reduce its overall free energy by favoring pathways (EMs) that have a more negative Gibbs free energy. Equivalently, pathways with a positive free energy are thermodynamically infeasible and are not assigned a weight in the analysis presented here (for the case of thermodynamic weighting). This allows flux determination based solely on reaction stoichiometry and thermodynamics from the EMs generated by EMA, rather than applying a metabolic assumption (maximizing growth rate) in optimization based studies (such as FBA). It is important to note that these weighting factors are not strictly predetermined but are determined within the context of the overall cellular network.
Generally speaking, equalweighting of the EMs was shown as a proofofprinciple demonstration of the algorithm. At the same time, this was used as a reference to determine whether certain gene knockouts were predicted under both weighting schemes. Ideally, a significant fraction of the gene knockouts identified would be consistent between equal and thermodynamic weighting of the modes. As a result, the incorporation of thermodynamic calculations was an integral part of this algorithm providing for more accurate flux distributions (as compared to FBAcalculated fluxes).
The utilization of a GA to search the solution space enables the identification of an optimal genotype in a computationally tractable amount of time. The number of reaction removals required to meet these predicted optimal values are well above what is computationally feasible through exhaustive searching. For example, even ten reaction removals (the smallest number for the ethanol case study) would require evaluating 2.74 × 10^{17 }in silico organisms. With the genetic algorithm, the simulations here converged when evaluating only 2,500 in silico organisms (50 generations of 50 individuals).
Metabolic and genetic networks are highly connected with significant regulation across scales, even for microbial systems. A clear disadvantage of the model and algorithm presented, as well as most of stoichiometric modeling, is the lack of integrated regulatory information. Because these models are used to study steadystate behavior, the dynamic regulation of these systems is neglected. There have been efforts to reconstruct genomescale transcriptional and translational (TRTR) networks and transcriptional regulatory networks (TRNs) [74,75]; however, the integration of these models with metabolic models has been somewhat limited [7679]. Utilizing EMA for identifying knockout targets for improving ethanol production in E. coli allowed for simultaneous utilization of pentoses and hexoses in batch culture [50]. This shows that a strictly stoichiometric analysis using EMA can synthetically deregulate catabolite repression (perhaps the most wellstudied means of metabolic regulation).
A potential limitation of this method is the utilization of EMA, which is computationally intensive and currently cannot be applied to genomescale metabolic networks. As cited previously, the computation time of EMA algorithms grows approximately quadratically with respect to the number of EMs and the number of EMs grows exponentially with respect to network size. For example, an E. coli model of 110 reactions (28 of which were reversible) using any combination of glucose, succinate, glycerol, and acetate contained 507,632 EMs [80]. However, when making many smallmolecule products through E. coli, minimal medium with a single carbonsource is often used such that many of the reactions in E. coli metabolism would not acquire flux. Therefore, the engineering of highflux pathways (glycolysis, the TCA cycle, etc.), as represented in this smallscale model, would have more impact on product formation. Very recently, the concept of elementary flux patterns was introduced, where an elementary flux pattern is defined as a set of reactions within a subsystem of a larger network that represents the basic routes of each steadystate flux of the larger network through the subnetwork [81]. They are computed using MILP and as a result, this technique can be applied to genomescale networks, a quality mediated by the fact that computation time climbs only polynomially with respect to network size. Also very recently, an algorithm was developed to identify the Kshortest EMs within a genomescale metabolic network utilizing integer linear programming [82]. The algorithm here could be similarly applied to these two recently developed algorithms.
Methods
Model Construction
The two smallscale E. coli stoichiometric models utilized in this study were based on one previously developed [50]. Briefly, because the previous model was developed for the utilization of multiple five and sixcarbon sugars, all of the carbonsource utilization reactions besides the glucose utilization reaction were removed; glucose was assumed to be actively uptaken by the phosphoenolpyruvate sugar transferase system. In the original model, an additional reaction was included due to a heterologous pyruvate decarboxylase from Zymomonas mobilis; this reaction is not native to E. coli and was therefore also removed.
For the lycopene case study, the ethanol model previously described served as a basis with four additional reactions added. Lycopene biosynthesis was introduced into the model and coupled to the nonmevalonate pathway (native to E. coli) previously used to support heterologous carotenoid production [83,84]. Whenever possible, linear pathways were combined into a single reaction to reduce the size of the model. The first reaction (encoded by dxs and ispCDEFGH) held the stoichiometry: glyceraldehyde3phosphate + pyruvate + 2 NADPH + ATP → dimethylallyl diphosphate (DMAPP) + CO2 + 2 NADP^{+ }+ ADP. The second reaction was for the reversible isomerization of DMAPP and isopentyl diphosphate (IPP), encoded by idi. The third reaction held the stoichiometry 4 IPP → geranylgeranyl diphosphate (GGPP) and is encoded by ispA and crtE. The last reaction was for lycopene biosynthesis and held the stoichiometry 2 GGPP + 8 NADPH → lycopene + 8 NADP^{+}, and is encoded by crtBI. To avoid the inclusion of a specific transport reaction, lycopene was not balanced in this reaction; however, it was taken into account for thermodynamic calculations. The final model included three more metabolites and four more reactions than the ethanol model.
Elementary Mode Enumeration
Elementary mode analysis (EMA) was undertaken utilizing the bit pattern tree method [85]. Developed recently, this algorithm is capable of enumerating 2,450,787 EMs over tentimes faster (on a fourthread system) than the latest release of METATOOL [86], and is therefore currently the fastest method for EM enumeration [85]. The mathematical rigor associated with the bit pattern tree method and other EMA algorithms has been described previously [8688]. The code was acquired from Professor Jörg Stelling's website http://www.csb.ethz.ch/tools/efmtool webcite and interfaced with The MathWorks™ MATLAB software (version 7.6.0.324).
Pathway Gibbs Free Energy Calculations
The group contribution method of Mavrovouniotis was used in this study to estimate the standard Gibbs free energy of reaction for all of the model reactions [55]. Briefly, the group contribution method estimates the ΔG_{f}° of metabolites by decomposing a single molecular structure into a subset of smaller functional groups, each individually contributing to overall ΔG_{f}° values. The ΔG_{r}° is then known as a result of the known stoichiometry of the reaction in question. Although currency metabolites were not included in the stoichiometric model, they were accounted for in the Gibbs free energy calculations to ensure consistency with reported data. All of the metabolites used in the stoichiometric models utilized here had corresponding ΔG_{f}° values reported recently [56].
Genetic Algorithm
Chromosomal representation of the metabolic genotype for passing to the genetic algorithm is binary in nature where a "1" indicates the reaction is included in the individual and "0" indicates that the reaction is not present. For simplicity's sake, a onetoone association between reactions in the network and genes in the GA's population was assumed. This onetoone association decreases computation time by utilizing fewer variables for optimization. This onetoone association does not present a significant problem experimentally, for the geneassociations with the enzymes catalyzing the reactions are wellknown for E. coli due to the organism's biochemical knowledge and sequenced genome [8991]. A binary vector of length n therefore represents a single individual in the GA population.
Initialization of a population is a critical step for determining the success of the algorithm to find the global optimum. An initial population of fifty individuals containing between two and six knockouts was seeded to the algorithm (using MATLAB's "randerr" function). This was arrived at empirically as randomly seeding individuals with approximately 50% 0's resulted in mostly nonviable strains and did not allow for the GA to reach the optimal solution. Next, each individual in the population is evaluated and given a fitness score. A previous study on using GAs to optimize genotypic space for succinate, glycerol, and vanillin production used product flux determined by optimization (FBA and MoMA) as a scoring function [46]. As stated before, this approach relies on assumptions that may or may not be valid. Here, EMA was used as the method for scoring the individuals with fitness functions as described below.
Genetic algorithms use crossover of the chromosomes (mixing of two individuals in a population to create a new individual) and mutation (change a "0" to "1" and viceversa with a specified frequency) to evolve the solution population. The implementation here was interfaced with The MathWorks™ MATLAB software and its Genetic Algorithm & Direct Search Toolbox. For crossover, mutation, and selection of individuals, twopoint, uniform, and tournamentbased methods were used, respectively. These parameters were not optimized in this study. As stated, the population size was chosen as fifty individuals, with five of the top performing individuals automatically passed to the next generation of the GA. The selection function used in the GA was either roulette or tournamentbased. The GA always terminated as a result of being below the tolerance (of the MATLAB default, 10^{6}) which was always between 50 and 100 generations.
As a method to reduce the computation time of the GA optimization, the GA was forced to always include (through fixed inclusion of a "1" in the individual genotype) reactions that were determined to either 1) reduce maximal product yield to zero, or 2) reduce maximal biomass yield to zero (indicating a lethal knockout). This reduced the genotypic space from 60 to 49 variables in the ethanol case study and 64 to 52 variables in the lycopene case study.
Flux Determination & Fitness Function Selection
For the ethanol and lycopene casestudies, three fitness functions were examined utilizing both equalweighting of the EMs as well as thermodynamicallyweighted EMs. The flux vector can be recreated by taking the linear algebra innerproduct of the EM matrix, M, with a weightingvector, c. Here, n is the number of EMs.
The differences in the two methods are in how the weighting vector, c, is determined. For the equalweighting method:
For the case in which the EMs are weighted by thermodynamic calculations, the ΔG_{p}° values must be calculated from the ΔG_{r}° values:
Next, because it was previously determined that there existed a logarithmic relationship between the weighting factor of a particular EM and its contribution to the overall flux distribution with the change in entropy of the pathway, the weighting factor vector, c, is calculated with the following relationship:
Here, the T represents for temperature, which was taken to be 310.15K (37°C, the optimal temperature for E. coli growth). To satisfy the constraint that the sum of the weighting vector must be equal to unity, the weighting vector is then divided by the sum of the weighting vectors.
Three different fitness functions with different goals were examined. The fitness functions corresponding to Eq. 5, Eq. 6, and Eq. 7 are all nondimensionalized to unity by dividing by the theoretical product of biomass yields on substrate. However, achieving the theoretical yield of both a product and biomass on a particular substrate is impossible. Equation 5 was used to optimize a network structure for yield of product, P, on a substrate, S (in this case, glucose).
Equation 6 was used to optimize for the biomass (X) yield on substrate:
While both of the first two fitness functions are relevant for testing the functionality of the model here, a metabolic network that produces biomass and no product, or viceversa, is not desirable. To optimize both biomass and product formation, a new fitness function was created as the product of Eqs. 5 and 6, which equally weights both biomass yield and product yield. In taking this step, the case where a cell contains a metabolic network incapable of producing either product or biomass is prevented:
The computation time for these GA simulations were between 520 min on a notebook equipped with an Intel^{® }Core™ 2 Duo T9300 CPU running at 2.50 GHz, 4.0 GB memory, and a 32bit version of Microsoft Windows Vista™ Ultimate.
Flux Balance Analysis
Flux balance analysis is a linear programming method in which metabolic fluxes are determined by optimizing for biomass formation (maximizing growth rate) [92]. This was accomplished utilizing the "linprog" MATLAB function on the ethanol 47 × 60 stoichiometric matrix. For reversible reactions, a lower flux limit of 10 (arbitrary units) was used, while for irreversible reactions, a lower limit of 0 was used. For both reversible and irreversible reactions, the upper limit was chosen to be 10. The glucose uptake rate was fixed to 1 so as to scale the fluxes to glucose uptake rate and compare to the fluxes determined through weighting of the EMs. The general problem is posed as the following:
Here, S is the stoichiometric matrix as described as previously and v is the flux vector.
Maximize: z = c^{T}v
In this optimization framework, c is a row vector containing weighting factors for individual fluxes on the objective function, z. For FBA calculations, this objective is solely the biomass reaction flux. a_{i }and b_{i }are the lower and upper bounds, respectively, of each flux as determined by either thermodynamics or experimental measurements.
Multiobjective Genetic Algorithm
The MOGA invoked the same crossover, mutation, and selection algorithms as in the singleobjective GA. Here, 200 individuals were used per population and the initial population was seeded randomly using between two and six removed reactions. The MOGA was run on the ethanol model using both equalweighted and thermodynamicweighted EMs subject to the two following fitness functions (those from Eq. 5 and Eq. 6):
The computation time for these MOGA simulations was much greater than the singleobjective GA simulations, as expected. These simulations generally terminated after approximately 48 hours running on the same computer system described above.
Abbreviations
ΔG_{f}°: standard change in Gibbs free energy of formation; ΔG_{p}°: standard change in Gibbs free energy across a pathway/EM; ΔG_{r}°: standard change in Gibbs free energy across a reaction; ΔS_{p}°: standard change in entropy across a pathway/EM; EM: elementary mode; EMA: elementary mode analysis; FBA: flux balance analysis; GA: genetic algorithm; MFA: metabolic flux analysis; MOGA: multiobjective genetic algorithm; MoMA: minimization of metabolic adjustment; NSGAII: nondominated sorting genetic algorithmII; ROOM: regulatory on/off minimization; TRN: transcriptional regulatory network; TRTR: transcriptional and translational.
Authors' contributions
BAB conceived of the study, participated in its design, performed simulations, and helped draft the manuscript. HS calculated thermodynamic values. KL conceived of the study, participated in its design, and helped draft the manuscript. BAP participated in the study's design and helped draft the manuscript. All authors read and approved the final manuscript.
Acknowledgements
BAB and BAP would like to thank the Tufts University Faculty Research Awards Committee for support. BAB would also like to thank Ryan Nolan and Mark Walker for their helpful discussions.
References

Adrio JL, Demain AL: Genetic improvement of processes yielding microbial products.
FEMS Microbiol Rev 2006, 30:187214. PubMed Abstract  Publisher Full Text

Demain AL, Adrio JL: Strain improvement for production of pharmaceuticals and other microbial metabolites by fermentation.

Demain AL, Adrio JL: Contributions of microorganisms to industrial biology.
Mol Biotechnol 2008, 38:4155. PubMed Abstract  Publisher Full Text

Alper H, Stephanopoulos G: Engineering for biofuels: exploiting innate microbial capacity or importing biosynthetic potential?
Nat Rev Microbiol 2009, 7:715723. PubMed Abstract  Publisher Full Text

Atsumi S, Hanai T, Liao JC: Nonfermentative pathways for synthesis of branchedchain higher alcohols as biofuels.
Nature 2008, 451:8689. PubMed Abstract  Publisher Full Text

Atsumi S, Liao JC: Metabolic engineering for advanced biofuels production from Escherichia coli.
Curr Opin Biotechnol 2008, 19:414419. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Connor MR, Liao JC: Microbial production of advanced transportation fuels in nonnatural hosts.
Curr Opin Biotechnol 2009, 20:307315. PubMed Abstract  Publisher Full Text

Keasling JD, Chou H: Metabolic engineering delivers nextgeneration biofuels.
Nat Biotechnol 2008, 26:298299. PubMed Abstract  Publisher Full Text

Lynd LR, Laser MS, Bransby D, Dale BE, Davison B, Hamilton R, Himmel M, Keller M, McMillan JD, Sheehan J, Wyman CE: How biotech can transform biofuels.
Nat Biotechnol 2008, 26:169172. PubMed Abstract  Publisher Full Text

Mukhopadhyay A, Redding AM, Rutherford BJ, Keasling JD: Importance of systems biology in engineering microbes for biofuel production.
Curr Opin Biotechnol 2008, 19:228234. PubMed Abstract  Publisher Full Text

Savage DF, Way J, Silver PA: Defossiling fuel: how synthetic biology can transform biofuel production.
ACS Chem Biol 2008, 3:1316. PubMed Abstract  Publisher Full Text

Stephanopoulos G: Challenges in engineering microbes for biofuels production.
Science 2007, 315:801804. PubMed Abstract  Publisher Full Text

Wackett LP: Biomass to fuels via microbial transformations.
Curr Opin Chem Biol 2008, 12:187193. PubMed Abstract  Publisher Full Text

Lu X, Vora H, Khosla C: Overproduction of free fatty acids in E. coli: implications for biodiesel production.
Metab Eng 2008, 10:333339. PubMed Abstract  Publisher Full Text

Angermayr SA, Hellingwerf KJ, Lindblad P, de Mattos MJ: Energy biotechnology with cyanobacteria.
Curr Opin Biotechnol 2009, 20:257263. PubMed Abstract  Publisher Full Text

Park JH, Lee SY: Towards systems metabolic engineering of microorganisms for amino acid production.
Curr Opin Biotechnol 2008, 19:454460. PubMed Abstract  Publisher Full Text

Koffas M, Stephanopoulos G: Strain improvement by metabolic engineering: lysine production as a case study for systems biology.
Curr Opin Biotechnol 2005, 16:361366. PubMed Abstract  Publisher Full Text

Lee JH, Sung BH, Kim MS, Blattner FR, Yoon BH, Kim JH, Kim SC: Metabolic engineering of a reducedgenome strain of Escherichia coli for Lthreonine production.
Microb Cell Fact 2009, 8:2. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Lee KH, Park JH, Kim TY, Kim HU, Lee SY: Systems metabolic engineering of Escherichia coli for Lthreonine production.
Mol Syst Biol 2007, 3:149. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Park SD, Lee JY, Sim SY, Kim Y, Lee HS: Characteristics of methionine production by an engineered Corynebacterium glutamicum strain.
Metab Eng 2007, 9:327336. PubMed Abstract  Publisher Full Text

Clardy J, Fischbach MA, Walsh CT: New antibiotics from bacterial natural products.
Nat Biotechnol 2006, 24:15411550. PubMed Abstract  Publisher Full Text

Cragg GM, Newman DJ, Snader KM: Natural products in drug discovery and development.
J Nat Prod 1997, 60:5260. PubMed Abstract  Publisher Full Text

Demain AL: From natural products discovery to commercialization: a success story.
J Ind Microbiol Biotechnol 2006, 33:486495. PubMed Abstract  Publisher Full Text

Demain AL: Antibiotics: Natural products essential to human health.
Med Res Rev 2009, 29:821842. PubMed Abstract  Publisher Full Text

Paterson I, Anderson EA: Chemistry. The renaissance of natural products as drug candidates.
Science 2005, 310:451453. PubMed Abstract  Publisher Full Text

Demain AL, Vaishnav P: Production of recombinant proteins by microbes and higher organisms.
Biotechnol Adv 2009, 27:297306. PubMed Abstract  Publisher Full Text

Jantama K, Zhang X, Moore JC, Shanmugam KT, Svoronos SA, Ingram LO: Eliminating side products and increasing succinate yields in engineered strains of Escherichia coli C.
Biotechnol Bioeng 2008, 101:881893. PubMed Abstract  Publisher Full Text

Alper H, Stephanopoulos G: Global transcription machinery engineering: A new approach for improving cellular phenotype.
Metab Eng 2007, 9:258267. PubMed Abstract  Publisher Full Text

Varma A, Boesch BW, Palsson BO: Stoichiometric interpretation of Escherichia coli glucose catabolism under various oxygenation rates.
Appl Environ Microbiol 1993, 59:24652473. PubMed Abstract  PubMed Central Full Text

Varma A, Palsson BO: Stoichiometric flux balance models quantitatively predict growth and metabolic byproduct secretion in wildtype Escherichia coli W3110.
Appl Environ Microbiol 1994, 60:37243731. PubMed Abstract  PubMed Central Full Text

Varma A, Palsson BO: Predictions for oxygen supply control to enhance population stability of engineered production strains.
Biotechnol Bioeng 1994, 43:275285. PubMed Abstract  Publisher Full Text

Stephanopoulos G: Metabolic fluxes and metabolic engineering.
Metab Eng 1999, 1:111. PubMed Abstract  Publisher Full Text

Fell DA: Metabolic control analysis: a survey of its theoretical and experimental development.
Biochem J 1992, 286(Pt 2):313330. PubMed Abstract  PubMed Central Full Text

Schuster S, Fell DA, Dandekar T: A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks.
Nat Biotechnol 2000, 18:326332. PubMed Abstract  Publisher Full Text

Papin JA, Stelling J, Price ND, Klamt S, Schuster S, Palsson BO: Comparison of networkbased pathway analysis methods.
Trends Biotechnol 2004, 22:400405. PubMed Abstract  Publisher Full Text

Varner J, Ramkrishna D: Metabolic engineering from a cybernetic perspective. 2. Qualitative investigation of nodal architechtures and their response to genetic perturbation.
Biotechnol Prog 1999, 15:426438. PubMed Abstract  Publisher Full Text

Varner J, Ramkrishna D: Metabolic engineering from a cybernetic perspective. 1. Theoretical preliminaries.
Biotechnol Prog 1999, 15:407425. PubMed Abstract  Publisher Full Text

Savageau MA: Biochemical systems analysis. 3. Dynamic solutions using a powerlaw approximation.
J Theor Biol 1970, 26:215226. PubMed Abstract  Publisher Full Text

Savageau MA: Biochemical systems analysis. I. Some mathematical properties of the rate law for the component enzymatic reactions.
J Theor Biol 1969, 25:365369. PubMed Abstract  Publisher Full Text

Savageau MA: Biochemical systems analysis. II. The steadystate solutions for an npool system using a powerlaw approximation.
J Theor Biol 1969, 25:370379. PubMed Abstract  Publisher Full Text

Segre D, Vitkup D, Church GM: Analysis of optimality in natural and perturbed metabolic networks.
Proc Natl Acad Sci USA 2002, 99:1511215117. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Shlomi T, Berkman O, Ruppin E: Regulatory on/off minimization of metabolic flux changes after genetic perturbations.
Proc Natl Acad Sci USA 2005, 102:76957700. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Burgard AP, Pharkya P, Maranas CD: Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization.
Biotechnol Bioeng 2003, 84:647657. PubMed Abstract  Publisher Full Text

Pharkya P, Burgard AP, Maranas CD: OptStrain: a computational framework for redesign of microbial production systems.
Genome Res 2004, 14:23672376. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Pharkya P, Maranas CD: An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems.
Metab Eng 2006, 8:113. PubMed Abstract  Publisher Full Text

Patil KR, Rocha I, Forster J, Nielsen J: Evolutionary programming as a platform for in silico metabolic engineering.
BMC Bioinformatics 2005, 6:308. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Edwards JS, Ibarra RU, Palsson BO: In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data.
Nat Biotechnol 2001, 19:125130. PubMed Abstract  Publisher Full Text

Schuetz R, Kuepfer L, Sauer U: Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli.
Mol Syst Biol 2007, 3:119. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Trinh CT, Carlson R, Wlaschin A, Srienc F: Design, construction and performance of the most efficient biomass producing E. coli bacterium.
Metab Eng 2006, 8:628638. PubMed Abstract  Publisher Full Text

Trinh CT, Unrean P, Srienc F: Minimal Escherichia coli cell for the most efficient production of ethanol from hexoses and pentoses.
Appl Environ Microbiol 2008, 74:36343643. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Gayen K, Venkatesh KV: Analysis of optimal phenotypic space using elementary modes as applied to Corynebacterium glutamicum.
BMC Bioinformatics 2006, 7:445. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Gayen K, Gupta M, Venkatesh KV: Elementary mode analysis to study the preculturing effect on the metabolic state of Lactobacillus rhamnosus during growth on mixed substrates.
In Silico Biol 2007, 7:123139. PubMed Abstract  Publisher Full Text

Wlaschin AP, Trinh CT, Carlson R, Srienc F: The fractional contributions of elementary modes to the metabolism of Escherichia coli and their estimation from reaction entropies.
Metab Eng 2006, 8:338352. PubMed Abstract  Publisher Full Text

Oh YG, Lee DY, Lee SY, Park S: Multiobjective flux balancing using the NISE method for metabolic network analysis.
Biotechnol Prog 2009, 25:9991008. PubMed Abstract  Publisher Full Text

Mavrovouniotis ML: Estimation of standard Gibbs energy changes of biotransformations.
J Biol Chem 1991, 266:1444014445. PubMed Abstract  Publisher Full Text

Jankowski MD, Henry CS, Broadbelt LJ, Hatzimanikatis V: Group contribution method for thermodynamic analysis of complex metabolic networks.
Biophys J 2008, 95:14871499. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Nolan RP, Fenley AP, Lee K: Identification of distributed metabolic objectives in the hypermetabolic liver by flux and energy balance analysis.
Metab Eng 2006, 8:3045. PubMed Abstract  Publisher Full Text

Beard DA, Liang SD, Qian H: Energy balance for analysis of complex metabolic networks.
Biophys J 2002, 83:7986. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Spiro S, Roberts RE, Guest JR: FNRdependent repression of the ndh gene of Escherichia coli and metal ion requirement for FNRregulated gene expression.
Mol Microbiol 1989, 3:601608. PubMed Abstract  Publisher Full Text

Yang YT, Bennett GN, San KY: Effect of inactivation of nuo and ackApta on redistribution of metabolic fluxes in Escherichia coli.
Biotechnol Bioeng 1999, 65:291297. PubMed Abstract  Publisher Full Text

Lee S, Phalakornkule C, Domach MM, Grossmann IE: Recursive MILP model for finding all the alternate optima in LP models for metabolic networks.

Reed JL, Palsson BO: Genomescale in silico models of E. coli have multiple equivalent phenotypic states: assessment of correlated reaction subsets that comprise network states.
Genome Res 2004, 14:17971805. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Alper H, Miyaoku K, Stephanopoulos G: Construction of lycopeneoverproducing E. coli strains by combining systematic and combinatorial gene knockout targets.
Nat Biotechnol 2005, 23:612616. PubMed Abstract  Publisher Full Text

Alper H, Jin YS, Moxley JF, Stephanopoulos G: Identifying gene targets for the metabolic engineering of lycopene biosynthesis in Escherichia coli.
Metab Eng 2005, 7:155164. PubMed Abstract  Publisher Full Text

Alper H, Fischer C, Nevoigt E, Stephanopoulos G: Tuning genetic control through promoter engineering.
Proc Natl Acad Sci USA 2005, 102:1267812683. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Alper H, Miyaoku K, Stephanopoulos G: Characterization of lycopeneoverproducing E. coli strains in high cell density fermentations.
Appl Microbiol Biotechnol 2006, 72:968974. PubMed Abstract  Publisher Full Text

Yoon SH, Kim JE, Lee SH, Park HM, Choi MS, Kim JY, Lee SH, Shin YC, Keasling JD, Kim SW: Engineering the lycopene synthetic pathway in E. coli by comparison of the carotenoid genes of Pantoea agglomerans and Pantoea ananatis.
Appl Microbiol Biotechnol 2007, 74:131139. PubMed Abstract  Publisher Full Text

Jin YS, Stephanopoulos G: Multidimensional gene target search for improving lycopene biosynthesis in Escherichia coli.
Metab Eng 2007, 9:337347. PubMed Abstract  Publisher Full Text

Farmer WR, Liao JC: Precursor balancing for metabolic engineering of lycopene production in Escherichia coli.
Biotechnol Prog 2001, 17:5761. PubMed Abstract  Publisher Full Text

KleinMarcuschamer D, Ajikumar PK, Stephanopoulos G: Engineering microbial cell factories for biosynthesis of isoprenoid molecules: beyond lycopene.
Trends Biotechnol 2007, 25:417424. PubMed Abstract  Publisher Full Text

Alper H, Stephanopoulos G: Uncovering the gene knockout landscape for improved lycopene production in E. coli.
Appl Microbiol Biotechnol 2008, 78:801810. PubMed Abstract  Publisher Full Text

Yoon KW, Doo EH, Kim SW, Park JB: In situ recovery of lycopene during biosynthesis with recombinant Escherichia coli.
J Biotechnol 2008, 135:291294. PubMed Abstract  Publisher Full Text

Lee FC, Rangaiah GP, Ray AK: Multiobjective optimization of an industrial penicillin V bioreactor train using nondominated sorting genetic algorithm.
Biotechnol Bioeng 2007, 98:586598. PubMed Abstract  Publisher Full Text

Feist AM, Herrgard MJ, Thiele I, Reed JL, Palsson BO: Reconstruction of biochemical networks in microorganisms.
Nat Rev Microbiol 2009, 7:129143. PubMed Abstract  Publisher Full Text

Cho BK, Charusanti P, Herrgard MJ, Palsson BO: Microbial regulatory and metabolic networks.
Curr Opin Biotechnol 2007, 18:360364. PubMed Abstract  Publisher Full Text

Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO: Integrating highthroughput and computational data elucidates bacterial networks.
Nature 2004, 429:9296. PubMed Abstract  Publisher Full Text

Covert MW, Xiao N, Chen TJ, Karr JR: Integrating metabolic, transcriptional regulatory and signal transduction models in Escherichia coli.
Bioinformatics 2008, 24:20442050. PubMed Abstract  Publisher Full Text

Covert MW, Schilling CH, Palsson B: Regulation of gene expression in flux balance models of metabolism.
J Theor Biol 2001, 213:7388. PubMed Abstract  Publisher Full Text

Cox SJ, Shalel Levanon S, Bennett GN, San KY: Genetically constrained metabolic flux analysis.
Metab Eng 2005, 7:445456. PubMed Abstract  Publisher Full Text

Gagneur J, Klamt S: Computation of elementary modes: a unifying framework and the new binary approach.
BMC Bioinformatics 2004, 5:175. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Kaleta C, de Figueiredo LF, Schuster S: Can the whole be less than the sum of its parts Pathway analysis in genomescale metabolic networks using elementary ? Flux patterns.
Genome Res 2009, 19:18721883. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

de Figueiredo LF, Podhorski A, Rubio A, Kaleta C, Beasley JE, Schuster S, Planes FJ: Computing the shortest elementary flux modes in genomescale metabolic networks.
Bioinformatics 2009, 25:31583165. PubMed Abstract  Publisher Full Text

Das A, Yoon SH, Lee SH, Kim JY, Oh DK, Kim SW: An update on microbial carotenoid production: application of recent metabolic engineering tools.
Appl Microbiol Biotechnol 2007, 77:505512. PubMed Abstract  Publisher Full Text

Yuan LZ, Rouviere PE, Larossa RA, Suh W: Chromosomal promoter replacement of the isoprenoid pathway for enhancing carotenoid production in E. coli.
Metab Eng 2006, 8:7990. PubMed Abstract  Publisher Full Text

Terzer M, Stelling J: Large scale computation of elementary flux modes with bit pattern trees.
Bioinformatics 2008, 24:22292235. PubMed Abstract  Publisher Full Text

von Kamp A, Schuster S: Metatool 5.0: fast and flexible elementary modes analysis.
Bioinformatics 2006, 22:19301931. PubMed Abstract  Publisher Full Text

Urbanczik R, Wagner C: An improved algorithm for stoichiometric network analysis: theory and applications.
Bioinformatics 2005, 21:12031210. PubMed Abstract  Publisher Full Text

Haus UU, Klamt S, Stephen T: Computing knockout strategies in metabolic networks.
J Comput Biol 2008, 15:259268. PubMed Abstract  Publisher Full Text

Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, ColladoVides J, Glasner JD, Rode CK, Mayhew GF, et al.: The complete genome sequence of Escherichia coli K12.
Science 1997, 277:14531474. PubMed Abstract  Publisher Full Text

Durfee T, Nelson R, Baldwin S, Plunkett G, Burland V, Mau B, Petrosino JF, Qin X, Muzny DM, Ayele M, et al.: The complete genome sequence of Escherichia coli DH10B: insights into the biology of a laboratory workhorse.
J Bacteriol 2008, 190:25972606. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Hayashi K, Morooka N, Yamamoto Y, Fujita K, Isono K, Choi S, Ohtsubo E, Baba T, Wanner BL, Mori H, Horiuchi T: Highly accurate genome sequences of Escherichia coli K12 strains MG1655 and W3110.
Mol Syst Biol 2006, 2:2006 0007. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Edwards JS, Covert M, Palsson B: Metabolic modelling of microbes: the fluxbalance approach.
Environ Microbiol 2002, 4:133140. PubMed Abstract  Publisher Full Text