Departament d'Enginyeria Química, Universitat Rovira i Virgili, Avinguda Països Catalans 26, 43007-Tarragona, Spain

Technische Universität München. Fachgebiet für Systembiotechnologie, Boltzmannstr. 15 85748 Garching, Germany

Departament de Ciències Mèdiques Bàsiques, Institut de Recerca Biomèdica de Lleida (IRBLLEIDA), Universitat de Lleida, Montserrat Roig 2, 25008 Lleida, Spain

Abstract

Background

Design of newly engineered microbial strains for biotechnological purposes would greatly benefit from the development of realistic mathematical models for the processes to be optimized. Such models can then be analyzed and, with the development and application of appropriate optimization techniques, one could identify the modifications that need to be made to the organism in order to achieve the desired biotechnological goal. As appropriate models to perform such an analysis are necessarily non-linear and typically non-convex, finding their global optimum is a challenging task. Canonical modeling techniques, such as Generalized Mass Action (GMA) models based on the power-law formalism, offer a possible solution to this problem because they have a mathematical structure that enables the development of specific algorithms for global optimization.

Results

Based on the GMA canonical representation, we have developed in previous works a highly efficient optimization algorithm and a set of related strategies for understanding the evolution of adaptive responses in cellular metabolism. Here, we explore the possibility of recasting kinetic non-linear models into an equivalent GMA model, so that global optimization on the recast GMA model can be performed. With this technique, optimization is greatly facilitated and the results are transposable to the original non-linear problem. This procedure is straightforward for a particular class of non-linear models known as Saturable and Cooperative (SC) models that extend the power-law formalism to deal with saturation and cooperativity.

Conclusions

Our results show that recasting non-linear kinetic models into GMA models is indeed an appropriate strategy that helps overcome some of the numerical difficulties that arise during the global optimization task.

1 Background

Identifying optimization strategies for increasing strain productivity should be possible by applying optimization methods to detailed kinetic models of the target metabolism. Thus, a rational approach would pinpoint the changes to be made (e.g., by modulating gene expression) in order to achieve the desired biotechnological goals

The bottom-up approach was the original strategy for model building in the biological sciences. Bottom-up kinetic models require information that is seldom available, despite the increasing amount of kinetic data contained in a growing set of databases (for example see

An additional issue that is common to models built using both strategies is that such detailed kinetic models include non-convexities that lead to the existence of multiple local optima in which standard non-linear optimization algorithms may get trapped during the search. Several stochastic and deterministic global optimization methods have been proposed to overcome this limitation

Given all these issues, it is hardly surprising that linear stoichiometric models have emerged as the most popular tool to analyze genome-wide metabolic networks using optimization techniques. Linear optimization problems can be solved using very fast and efficient algorithms

The possibility of condensing information about a very large network in a compact form enabled stoichiometric models to provide interesting insights in many different cases. However, the apparent simplicity in building and analyzing stoichiometric models comes at the cost of neglecting regulatory signals, metabolite levels and dynamic constraints. Accounting for these features in a dynamic way requires using more detailed, non-linear, mathematical models

These models go a step further than stoichiometric models by incorporating regulatory influences through a set of ordinary differential equations that can account for the system's dynamics. Building such models is often impossible because the appropriate functional form that needs to be used to describe the dynamical behavior of specific processes is in general unknown. Modeling strategies based on systematic approximated kinetic representations, such as power-laws

Although building and analyzing comprehensive genome-wide detailed models is still not viable in most cases (see however

Efficient global optimization techniques are available for power-law models

The usefulness of the global optimization techniques developed for GMA models has been shown in the analysis of the adaptive response of yeast to heat shock

Based on ideas similar to those that led to the development of the power-law formalism, Sorribas et al.

Optimization of SC models faces a number of practical problems common to kinetic non-linear models

In this paper, and as a first step to define a framework for optimization of non-linear models with arbitrary form and extend FBA and related approaches to detailed kinetic models, we shall show the practical utility of recasting SC models into GMA models for optimization purposes. This technique is similar to the symbolic reformulation algorithm proposed by Smith and Pantelides

2 Results

2.1 Global optimization of non-linear models through recasting

As a proof of concept of the difficulties of globally optimizing non-linear models, and of the use of recasting to obtain practical solutions, we shall start by defining a reference biochemical network that corresponds to the reaction scheme in the figure "Branched network with feedback and feedforward regulation". The network has one externally fixed variable, X_{5}, and four internal metabolites. It includes six reactions and a branch point. X_{3} acts as a feedback inhibitor of the synthesis of X_{2}, while X_{1} is an activator of the synthesis of X_{4}.

**Branched network with feedback and feedforward regulation.** X_{5} is a fixed external variable that can be varied at will. A GMA reference model is set up by selecting appropriate parameters (see text).

The generic model for this system is:

Each of the velocities is a non-linear function of the metabolites involved. The SC representation provides a systematic way of defining a functional model of this pathway. As a demonstrative example, let us suppose that the numerical model is:

In these equations, _{r}_{r }_{r}

We shall now address the following questions:

(i) To what extent can general-purpose global optimization methods be applied to SC models? (ii) Given that a SC model can be recast as a GMA model (rGMA), is this useful for optimizing the original SC model? (iii) Are the results obtained with the rGMA equivalent to those of the original SC model? (iv) What are the practical advantages of optimizing a rGMA model?

2.2 Optimization goals

In order to address the questions posed at the end of the previous section, we shall define the following optimization tasks (note that changes in enzyme activities and metabolite concentrations are constrained between 0.2 ≤ _{r }_{i }

• O1: What is the optimal pattern of changes in enzyme activities that maximizes the objective function in the new steady-state for a fixed value of X_{5}?

• O2: What is the optimal pattern of changes in enzyme activities that maximizes the objective function in the new steady-state for a fixed value of X_{5}, considering a maximum allowable variation of 10% in the steady-state values of the intermediaries?

• O3: What is the optimal pattern of changes in enzyme activities that maximizes the objective function in the new steady-state for a fixed value of X_{5}, considering changes in the output flux from X_{4} of less than 10% with respect to its reference value?

• O4: What is the best set of changes, assuming that we can only manipulate three enzymes, that maximizes the objective function in the new steady-state for a fixed value of X_{5}, considering a maximum variation of 10% in the steady-state values of the intermediaries?

Two different objective functions (OF), the steady-state concentration of X_{3} and the flux v_{4}, have been considered for each optimization case, except for O3. This latter case has been optimized only in terms of the first objective (i.e., the steady-state concentration of X_{3}), because limits on v_{4} are already included in the formulation of the optimization problem.

2.3 Global optimization of SC models using BARON

We first address the optimization of the aforementioned model in its original SC form using state-of-the-art global optimization techniques. The model was coded in the algebraic modeling system GAMS 23.0.2 and solved with the commercial global optimization package BARON v.8.1.5 on an Intel 1.2 GHz machine. An optimality gap (i.e., tolerance) of 0.2% was set in all instances. As can be seen in Table

Results for the maximization of X_{3} and v_{4} and optimization goals O1-O4 using BARON v.8.1.5 for a tolerance of 0.2%.

| O | k_{1} | k_{2} | k_{3} | k_{4} | k_{5} | k_{6} | X_{3} | OG (%) | CPU (s) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.26 | 5.00 | 4.97 | 0.20 | 0.20 | 0.54 | 8.30 | 0.20 | 136.17 |
| 2 | 0.20 | 0.24 | 0.22 | 0.20 | 0.21 | 0.20 | 1.10 | 0.00 | 0.06 |
| 3 | 0.60 | 5.00 | 5.00 | 0.53 | 0.20 | 0.27 | 5.39 | 0.20 | 96.39 |
| 4 | 0.99 | 1.15 | 1.00 | 0.96 | 1.00 | 1.00 | 1.10 | 0.00 | 1.42 |

| O | k_{1} | k_{2} | k_{3} | k_{4} | k_{5} | k_{6} | v_{4} | OG (%) | CPU (s) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 4.61 | 5.00 | 5.00 | 5.00 | 0.72 | 1.20 | 37.40 | 0.20 | 157.83 |
| 2 | 3.22 | 3.73 | 5.00 | 4.99 | 0.21 | 0.22 | 31.33 | 0.00 | 1.67 |
| 3 | 0.88 | 0.94 | 0.88 | 0.96 | 0.23 | 3.00 | 6.60 | 0.00 | 10.53 |
| 4 | 1.16 | 1.00 | 1.34 | 1.34 | 1.00 | 1.00 | 7.61 | 0.00 | 3.61 |

Table _{4 }using the _{2 }and _{5}, and _{1 }and _{2 }that lead to the same objective function value is identified. Within these regions, one can decide which combination of changes should be selected based on additional cost arguments, as they all show the same performance in terms of the predefined objective function. This region could be further reduced by imposing additional constraints to the optimization.

**Equivalent optimal solutions for the case S1-O1-v4.** Blue points indicate results for the original SC model obtained with BARON. Red points identify solutions obtained for the corresponding rGMA model with the OA method (see text for details).

2.4 Recasting SC models into GMA models

Any SC model can be recast into a GMA canonical model by introducing the auxiliary variables

with appropriate initial conditions

For simulation purposes, model (3) is equivalent to the original SC model. As discussed in
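A minimal sketch of this kind of recasting is given below. It assumes that each SC rate has the saturable form $v_r = V_r \prod_j X_j^{n_{rj}} / \prod_j (K_{rj} + X_j^{n_{rj}})$. The symbols $V_r$, $K_{rj}$, $n_{rj}$, $\mu_{ir}$ and $z_{rj}$ are illustrative choices and need not coincide with the notation of the numbered equations.

```latex
\begin{align}
z_{rj} &= K_{rj} + X_j^{n_{rj}}, \\
\frac{dX_i}{dt} &= \sum_{r} \mu_{ir}\, V_r \prod_{j} X_j^{n_{rj}}\, z_{rj}^{-1}, \\
\frac{dz_{rj}}{dt} &= n_{rj}\, X_j^{n_{rj}-1}\, \frac{dX_j}{dt},
\qquad z_{rj}(0) = K_{rj} + X_j(0)^{n_{rj}} .
\end{align}
```

Under these assumptions every right-hand side is a sum of products of power-law functions, which is the GMA structure. The defining relation for $z_{rj}$ holds along the whole trajectory only if it holds at $t = 0$, which is why the initial conditions of the auxiliary variables matter.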

2.5 Steady-state optimization of SC models through recasting

The steady-state solutions of Eqn. (4b) satisfy also Eqn. (4a). Thus, for optimization purposes, the steady-state constraints of interest are:

According to these results, the optimization problem can be stated as:

In our reference model, we shall consider the following constraints:

Once the problem has been recast into a rGMA, its mathematical structure can be exploited in order to improve the efficiency of the solution procedure, as demonstrated by the authors in previous works. This problem has a GMA form except for the auxiliary constraint 5b, which is required to recast the SC into the rGMA. This constraint can be easily handled by means of relaxation techniques and exponential transformations similar to those used by the authors in their global optimization algorithms for pure GMA models
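The role of the exponential transformation can be illustrated with a short sketch. It assumes the change of variables y_j = ln X_j, under which a power-law term γ ∏ X_j^{f_j} with γ > 0 becomes exp(ln γ + Σ f_j y_j), a convex function of y. Any first-order expansion of such a term is then a valid linear underestimator, which is the kind of cut an outer-approximation scheme can accumulate. The function names are hypothetical and this is not the customized OA itself.

```python
import math


def power_law_term(gamma, orders, x):
    """Evaluate gamma * prod(x_j ** f_j) in the original variables."""
    value = gamma
    for f, xj in zip(orders, x):
        value *= xj ** f
    return value


def log_term(gamma, orders, y):
    """Same term after y_j = ln(x_j): exp(ln(gamma) + sum(f_j * y_j))."""
    return math.exp(math.log(gamma) + sum(f * yj for f, yj in zip(orders, y)))


def tangent_cut(gamma, orders, y0):
    """Return (slope, intercept) of the first-order expansion at y0.

    The transformed term is convex in y, so the plane
    slope . y + intercept never exceeds the term itself.
    """
    t0 = log_term(gamma, orders, y0)
    slope = [t0 * f for f in orders]
    intercept = t0 - sum(s * yj for s, yj in zip(slope, y0))
    return slope, intercept


def cut_value(slope, intercept, y):
    """Evaluate the linear cut at the point y."""
    return sum(s * yj for s, yj in zip(slope, y)) + intercept
```

Adding cuts at several linearization points tightens the relaxation from below, while a local solve of the original problem supplies the other bound. The gap between the two bounds is what the tolerance controls.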

As can be seen in Table

Results for the maximization of X_{3} and v_{4} using the rGMA model and optimization goals O1-O4 using the customized OA for a tolerance of 0.2%.

| O | k_{1} | k_{2} | k_{3} | k_{4} | k_{5} | k_{6} | X_{3} | OG (%) | CPU (s) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.26 | 5.00 | 5.00 | 0.20 | 0.20 | 0.20 | 8.30 | 0.20 | 2.94 |
| 2 | 0.21 | 0.22 | 0.21 | 0.20 | 0.20 | 0.20 | 1.10 | 0.00 | 0.06 |
| 3 | 0.60 | 5.00 | 5.00 | 0.53 | 0.20 | 0.24 | 5.40 | 0.13 | 2.35 |
| 4 | 1.00 | 1.05 | 0.97 | 0.92 | 1.00 | 1.00 | 1.10 | 0.00 | 0.23 |

| O | k_{1} | k_{2} | k_{3} | k_{4} | k_{5} | k_{6} | v_{4} | OG (%) | CPU (s) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3.96 | 5.00 | 5.00 | 5.00 | 0.20 | 2.99 | 37.47 | 0.00 | 0.16 |
| 2 | 3.22 | 3.55 | 5.00 | 4.99 | 0.20 | 0.21 | 31.33 | 0.17 | 0.66 |
| 3 | 0.68 | 1.79 | 1.12 | 1.27 | 0.20 | 0.21 | 6.60 | 0.00 | 0.12 |
| 4 | 1.16 | 1.00 | 1.34 | 1.34 | 1.00 | 1.00 | 7.61 | 0.11 | 1.98 |

Note that the objective function values obtained with the SC and rGMA models differ only within the imposed tolerance. In some cases the calculated enzymatic profiles do differ. This is mainly due to the system's structure: the problem has multiple solutions that attain the same objective function value with different enzymatic configurations, as discussed in section 2.3.
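The agreement between the two sets of objective function values can be checked directly from the reported numbers. The sketch below copies the X_{3} and v_{4} values listed for goals O1-O4 in the BARON (SC) and customized OA (rGMA) results. The helper names are hypothetical, and the relative difference is measured against the larger of the two values, which is one reasonable convention among several.

```python
TOLERANCE = 0.002  # the 0.2% optimality gap used in all instances

# Objective function values for O1-O4, copied from the two results tables.
sc_values = {"X3": [8.30, 1.10, 5.39, 1.10], "v4": [37.40, 31.33, 6.60, 7.61]}
rgma_values = {"X3": [8.30, 1.10, 5.40, 1.10], "v4": [37.47, 31.33, 6.60, 7.61]}


def relative_difference(a, b):
    """Absolute difference scaled by the larger magnitude."""
    return abs(a - b) / max(abs(a), abs(b))


def max_relative_difference(sc, rgma):
    """Largest relative difference over all objectives and goals."""
    return max(
        relative_difference(a, b)
        for key in sc
        for a, b in zip(sc[key], rgma[key])
    )


worst = max_relative_difference(sc_values, rgma_values)
within_tolerance = worst < TOLERANCE
```

Only two of the eight pairs differ at the reported precision (X_{3} for O3 and v_{4} for O1), and both differences stay below the 0.2% tolerance.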

To further investigate this issue, we apply the multi-solution capability of BARON to the rGMA model (Figure

The region illustrated in Figure _{2 }and _{5 }in the region defined by constraints 4 ≤ _{2 }≤ 5 and 0.2 ≤ _{5 }≤ 0.8, and solve the optimization problem within each cell applying BARON to the SC model, and our OA to the rGMA model. Recall that these linear constraints define a region that contains that in Figure
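The subdivision described above can be sketched as follows, assuming the region 4 ≤ k_{2} ≤ 5, 0.2 ≤ k_{5} ≤ 0.8 is split into 8 intervals of equal width per variable. The function names and the dictionary layout are hypothetical. Each cell would be passed as additional bound constraints to whichever solver is being tested, and that solver call is not shown.

```python
def split_interval(lower, upper, n_cells):
    """Split [lower, upper] into n_cells sub-intervals of equal width."""
    width = (upper - lower) / n_cells
    return [(lower + i * width, lower + (i + 1) * width) for i in range(n_cells)]


def build_grid(k2_bounds=(4.0, 5.0), k5_bounds=(0.2, 0.8), n_cells=8):
    """Map (k2 index, k5 index), both 1-based, to the bounds of that cell."""
    k2_cells = split_interval(k2_bounds[0], k2_bounds[1], n_cells)
    k5_cells = split_interval(k5_bounds[0], k5_bounds[1], n_cells)
    return {
        (i + 1, j + 1): {"k2": k2_cells[i], "k5": k5_cells[j]}
        for i in range(n_cells)
        for j in range(n_cells)
    }


grid = build_grid()
```

Solving the same problem independently in each of the 64 cells gives one objective value and one CPU time per cell, which is the layout of the grid tables that follow.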

Results (objective function) of the optimization of case O1-v_{4} for specific regions of k_{2} and k_{5} obtained with BARON for the SC model.

| k_{5}/k_{2} | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 8 | 36.50 | 36.71 | 36.90 | 37.08 | 37.24 | 37.37 | 37.47 | 37.47 |
| 7 | 36.62 | 36.83 | 37.02 | 37.19 | 37.34 | 37.46 | 37.47 | 37.47 |
| 6 | 36.75 | 36.95 | 37.14 | 37.31 | 37.44 | 37.47 | 37.47 | 37.47 |
| 5 | 36.88 | 37.08 | 37.26 | 37.41 | 37.47 | 37.47 | 37.47 | 37.47 |
| 4 | 37.02 | 37.21 | 37.38 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |
| 3 | 37.15 | 37.34 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |
| 2 | 37.29 | 37.46 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |
| 1 | 37.43 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |

Domain of each k_{r} (4 ≤ k_{2} ≤ 5; 0.2 ≤ k_{5} ≤ 0.8) has been split into 8 intervals with equal width.

Results (objective function) of the optimization of case O1-v_{4} for specific regions of k_{2} and k_{5} obtained with the customized OA for the rGMA model.

| k_{5}/k_{2} | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 8 | 36.50 | 36.71 | 36.90 | 37.08 | 37.24 | 37.37 | 37.47 | 37.47 |
| 7 | 36.62 | 36.83 | 37.02 | 37.19 | 37.34 | 37.46 | 37.47 | 37.47 |
| 6 | 36.75 | 36.95 | 37.14 | 37.31 | 37.44 | 37.47 | 37.47 | 37.47 |
| 5 | 36.88 | 37.08 | 37.26 | 37.41 | 37.47 | 37.47 | 37.47 | 37.47 |
| 4 | 37.02 | 37.21 | 37.38 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |
| 3 | 37.15 | 37.34 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |
| 2 | 37.29 | 37.46 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |
| 1 | 37.43 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 | 37.47 |

Domain of each k_{r} (4 ≤ k_{2} ≤ 5; 0.2 ≤ k_{5} ≤ 0.8) has been split into 8 intervals with equal width.

Results (CPU time in seconds) of the optimization of case O1-v_{4} for specific regions of k_{2} and k_{5} obtained with BARON for the SC model.

| k_{5}/k_{2} | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 8 | 212.53 | 308.53 | 185.64 | 201.80 | 222.30 | 201.53 | 139.16 | 178.31 |
| 7 | 194.81 | 161.16 | 215.80 | 196.81 | 344.73 | 243.02 | 0.03 | 174.81 |
| 6 | 234.30 | 203.75 | 147.08 | 180.69 | 328.34 | 254.42 | 304.11 | 280.53 |
| 5 | 212.08 | 282.41 | 329.33 | 237.34 | 208.02 | 292.27 | 200.00 | 154.62 |
| 4 | 288.00 | 160.14 | 92.94 | 235.80 | 172.69 | 147.14 | 56.11 | 150.28 |
| 3 | 125.56 | 111.17 | 150.27 | 187.52 | 337.97 | 158.16 | 112.66 | 264.12 |
| 2 | 239.70 | 190.59 | 100.03 | 138.47 | 106.38 | 205.14 | 119.39 | 246.34 |
| 1 | 140.42 | 102.12 | 80.45 | 21.69 | 73.12 | 96.61 | 89.94 | 80.03 |

Domain of each k_{r} (4 ≤ k_{2} ≤ 5; 0.2 ≤ k_{5} ≤ 0.8) has been split into 8 intervals with equal width.

Results (CPU time in seconds) of the optimization of case O1-v_{4} for specific regions of k_{2} and k_{5} obtained with the customized OA for the rGMA model.

| k_{5}/k_{2} | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 8 | 0.13 | 0.27 | 0.23 | 0.18 | 0.17 | 0.19 | 0.28 | 0.28 |
| 7 | 0.26 | 0.28 | 0.28 | 0.26 | 0.28 | 0.23 | 0.32 | 0.25 |
| 6 | 0.32 | 0.30 | 0.28 | 0.28 | 0.27 | 0.23 | 0.19 | 0.25 |
| 5 | 0.31 | 0.21 | 0.25 | 0.25 | 0.26 | 0.28 | 0.27 | 0.29 |
| 4 | 0.25 | 0.27 | 0.32 | 0.30 | 0.25 | 0.27 | 0.26 | 0.28 |
| 3 | 0.20 | 0.22 | 0.28 | 0.28 | 0.29 | 0.30 | 0.19 | 0.53 |
| 2 | 0.28 | 0.25 | 0.19 | 0.19 | 0.22 | 0.17 | 0.30 | 0.25 |
| 1 | 0.23 | 0.24 | 0.26 | 0.27 | 0.23 | 0.21 | 0.24 | 0.31 |

Domain of each k_{r} (4 ≤ k_{2} ≤ 5; 0.2 ≤ k_{5} ≤ 0.8) has been split into 8 intervals with equal width.

2.6 Difficult optimization tasks can be solved via recasting

The reference model can be optimized either by general purpose techniques or by rGMA specific methods such as the customized OA. However, even with this simple example, we may encounter instances that are hard to solve using standard techniques. Consider, for instance, the same reaction scheme as before but this time with the alternative parameters indicated in the following model:

The optimization task of interest is:

• O5: What is the optimal pattern of changes in enzyme activities that maximizes v_{6} in the new steady-state for a fixed value of X_{5}, considering the following constraints?

When BARON is employed to solve this case using the native SC form, it cannot reduce the optimality gap below the specified tolerance after 1 hour of CPU time. In contrast, when the model is recast into its rGMA form and our OA method is applied, the global optimum can be determined with an optimality gap of 2% in 10.95 seconds (see Table

Results of the optimization of model 8 with BARON (SC model) and the customized OA (rGMA model).

| Solver | k_{1} | k_{2} | k_{3} | k_{4} | k_{5} | k_{6} | OF | OG (%) | CPU (s) |
|---|---|---|---|---|---|---|---|---|---|
| BARON (SC) | 6.24 | 5.16 | 0.46 | 0.6 | 8.46 | 9.09 | 60.36 | 45.18 | 3600 |
| OA (rGMA) | 6.25 | 5.17 | 0.45 | 0.6 | 8.44 | 9.1 | 60.46 | 2.18 | 10.95 |

3 Discussion

While experimental tools to manipulate gene expression are already available, there is no established set of guidelines on how these tools can be used to achieve a certain goal. So far, two main difficulties have prevented model driven optimization from becoming a standard in providing such guidelines: (i) the lack of information to build detailed kinetic models and (ii) the computational difficulties that arise upon the optimization of such models. The latter can be exemplified by the application of mixed integer non-linear optimization techniques (MINLP) in the context of kinetic models presented in

Our results can be of particular interest for multicriteria optimization of realistic models. Problems of this kind are relevant when exploring the adaptive response to changing conditions, where conflicting goals may be at play

4 Conclusions

We expect that the possibility of building models using non-linear approximate formalisms and of subsequently optimizing these models will trigger interest in the experimental characterization of the components of cellular metabolism. After the genomic explosion, we need to step back and begin to measure enzyme activities, metabolite levels, and regulatory signals on a larger scale than we used to do before, if we want to understand the emergence of the dynamic properties of biological systems and to be able to develop successful biotechnological applications.

5 Methods

5.1 Modelling strategies

The process of model building and optimization can be used to understand how a system should be changed in order to achieve specific biotechnological goals or how the same system has evolved in order to more efficiently execute a given biological function. Different trade-offs are considered during the modeling process. On the one hand, one wants to use models that are as simple as possible to guarantee numerical tractability. Unfortunately simplifications may lead to models whose accuracy is only ensured for a limited range of physiological conditions. On the other hand, models that are very detailed and accurate over a wide range of physiological conditions are typically more difficult to analyze and optimize. Needless to say, the type of modeling strategy and the model one chooses to implement have a large impact on the results of the analysis. The most widely used strategies in the context of optimization are: (1) Stoichiometric models, (2) Kinetic models, and (3) Approximated models.

The three strategies have as a starting point a set of ordinary differential equations, in which the dependent variables or nodes are the chemical species whose dynamical behavior one is interested in studying. For a system with

_{ir }_{i }

At this stage, the various strategies begin to differ in the way that they implement and analyze the equations. Typically, Flux balance analysis (FBA) and related techniques consider only the steady state behavior of the system, and treat _{r }

This system of equations is solved under different assumptions. A typical problem is that of understanding the effect of knocking out different genes from the system. This analysis can be performed by setting _{r }

To overcome these limitations, we must use more complex kinetic models where the effect of changing the values of the variables on the fluxes is taken into account. This requires defining a functional form for each _{r }

As an alternative, theoretically well supported canonical representations can be derived using approximation theory. One type of such representations are power-law models. In a power-law model, each _{r }

This approximation is derived at a given operating point

Then, the aggregated processes are represented by power-law functions:

Alternatively, the GMA form is obtained representing each individual _{r }

The parameters in these representations have a clear physical interpretation. Kinetic orders, the exponents in the power-laws, are local sensitivities of the fluxes, either individual (_{rj }_{r}_{ij }_{ij }_{j}_{i}_{i }_{r}
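The interpretation of kinetic orders as local log-log sensitivities can be made concrete with a small numerical sketch. It approximates a Michaelis-Menten rate, chosen here purely as an illustrative target, by a single power law γ X^f at an operating point. The function names and parameter values are hypothetical.

```python
import math


def michaelis_menten(x, vmax=10.0, km=2.0):
    """Illustrative target rate law; the parameter values are arbitrary."""
    return vmax * x / (km + x)


def kinetic_order(rate, x0, rel_step=1e-6):
    """Central-difference estimate of d ln(v) / d ln(x) at x0."""
    hi, lo = x0 * (1 + rel_step), x0 * (1 - rel_step)
    return (math.log(rate(hi)) - math.log(rate(lo))) / (math.log(hi) - math.log(lo))


def power_law_approximation(rate, x0):
    """Return (gamma, f) so that gamma * x**f matches the rate and its
    logarithmic slope at the operating point x0."""
    f = kinetic_order(rate, x0)
    gamma = rate(x0) / x0 ** f
    return gamma, f


gamma, f = power_law_approximation(michaelis_menten, x0=2.0)
```

For a Michaelis-Menten rate the kinetic order has the closed form K/(K + X_0), so at X_0 = K the estimate should be close to 0.5. The approximation is exact at the operating point and degrades as X moves away from it, which is the limitation the SC formalism is meant to address.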

To complement the power-law approach, the Saturable and Cooperative (SC) formalism was introduced by Sorribas et al.

This representation can be obtained from a power-law model defined at a given operating point X_{0} = (X_{10}, ..., X_{(n + m)0}) through the following relationships:

Thus SC uses the same information as the power-law except for the new parameters _{rj }

where _{r0 }_{r}_{10},.., _{n0}_{(n + m)0}) and _{rj }_{j }_{rj }_{j }_{rj }
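How SC parameters can follow from power-law information may be sketched for a single variable. The sketch assumes a rate of the form v = V X^n / (K + X^n), for which the kinetic order at X_0 is f = nK / (K + X_0^n). Solving for K and then matching the flux v_0 at the operating point gives the expressions coded below. The function names are hypothetical and the parametrization may differ from the relationships used in the text.

```python
def sc_parameters(v0, x0, f, n):
    """Return (V, K) of v = V * x**n / (K + x**n) matching flux v0 and
    kinetic order f at the operating point x0.

    From f = n*K / (K + x0**n) it follows that K = f * x0**n / (n - f).
    """
    if n == f:
        raise ValueError("n must differ from the kinetic order f")
    ratio = f / (n - f)
    if ratio <= 0:
        raise ValueError("f and n must give a positive K (for example 0 < f < n)")
    k = ratio * x0 ** n
    v_max = v0 * (k + x0 ** n) / x0 ** n
    return v_max, k


def sc_rate(x, v_max, k, n):
    """Evaluate the single-variable saturable and cooperative rate."""
    return v_max * x ** n / (k + x ** n)


v_max, k = sc_parameters(v0=5.0, x0=2.0, f=0.5, n=1.0)
```

The only information used beyond the power-law model is the exponent n, which controls cooperativity. With n = 1 the expression reduces to a Michaelis-Menten rate law.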

Using SC models for global optimization can raise some numerical issues. These difficulties can be avoided to a large extent by recasting SC models into a canonical GMA model, through the introduction of auxiliary variables, as will be shown in the next section.

5.2 Recasting non-linear models into power-law canonical models by increasing the number of variables

Non-linear models can be

As a very simple introductory example, consider a linear pathway with two internal metabolites, X_{1} and X_{2}, and a source metabolite, X_{3} (see the figure "A simple linear network"). X_{2} is a competitive inhibitor of the synthesis of X_{1} from the source metabolite. A generic model using Michaelis-Menten kinetic functions, with competitive inhibition of the first reaction by X_{2}, can be written as:

**A simple linear network.**

in which X_{3} is an externally fixed variable.

Recasting this model as a rGMA can be done as follows. First, let us define three new variables:

We can now write model (20) as:

with initial conditions

To complete the recasting we must now provide the equations that follow the change in the new variables over time. These are given by the following equations:

with initial conditions

The resulting rGMA model (22-23) is an exact representation of the model in (20). Hence, for a set of appropriate initial conditions, simulating the dynamic response with either the rGMA model or the original model produces the same trajectory. In principle, any non-linear model can be recast into a rGMA following a similar procedure
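The equivalence of the two simulations can be checked numerically with a minimal sketch. It assumes the standard competitive-inhibition Michaelis-Menten form for the first reaction and plain Michaelis-Menten kinetics for the other two. The auxiliary variable names (z1, z2, z3), the parameter values and the fixed-step Runge-Kutta integrator are all illustrative choices, not taken from the numbered equations.

```python
PARAMS = {"V1": 2.0, "V2": 1.5, "V3": 1.0,
          "K1": 1.0, "K2": 0.5, "K3": 0.8, "Ki": 2.0, "X3": 3.0}


def original_rhs(state, p):
    """Michaelis-Menten pathway with competitive inhibition of v1 by x2."""
    x1, x2 = state
    v1 = p["V1"] * p["X3"] / (p["K1"] * (1 + x2 / p["Ki"]) + p["X3"])
    v2 = p["V2"] * x1 / (p["K2"] + x1)
    v3 = p["V3"] * x2 / (p["K3"] + x2)
    return [v1 - v2, v2 - v3]


def recast_initial_state(x1, x2, p):
    """Auxiliary variables start at the values of the three denominators."""
    z1 = p["K1"] * (1 + x2 / p["Ki"]) + p["X3"]
    return [x1, x2, z1, p["K2"] + x1, p["K3"] + x2]


def recast_rhs(state, p):
    """Recast model: every rate is a product of power laws."""
    x1, x2, z1, z2, z3 = state
    v1 = p["V1"] * p["X3"] * z1 ** -1
    v2 = p["V2"] * x1 * z2 ** -1
    v3 = p["V3"] * x2 * z3 ** -1
    dx1, dx2 = v1 - v2, v2 - v3
    # dz1/dt = (K1/Ki) dx2/dt, dz2/dt = dx1/dt, dz3/dt = dx2/dt
    return [dx1, dx2, (p["K1"] / p["Ki"]) * dx2, dx1, dx2]


def rk4(rhs, state, p, dt, steps):
    """Classical fourth-order Runge-Kutta with a fixed step."""
    for _ in range(steps):
        k1 = rhs(state, p)
        k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)], p)
        k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)], p)
        k4 = rhs([s + dt * k for s, k in zip(state, k3)], p)
        state = [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state


def simulate(dt=0.01, steps=1000, x1=0.1, x2=0.1):
    """Integrate both formulations from matching initial conditions."""
    original = rk4(original_rhs, [x1, x2], PARAMS, dt, steps)
    recast = rk4(recast_rhs, recast_initial_state(x1, x2, PARAMS), PARAMS, dt, steps)
    return original, recast
```

Calling `simulate()` returns the final states of both formulations. The first two components of the recast state should agree with the original state up to rounding error, and the auxiliary variables should still equal the denominators they replace. Starting the auxiliary variables from any other values would break this agreement, which is why the initial conditions are part of the recast model.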

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AM-S suggested the potential utility of recasting for optimizing non-linear kinetic models. AS and AM-S elaborated on the recasting of SC models and planned the work. CP, GG-G and LJ implemented the OA algorithm and worked out the technical solution for applying it to a rGMA model. CP and GG-G performed the optimization tasks. AS and RA defined the reference model and obtained the numerical parameters used in the paper. All authors read and approved the final manuscript.

Acknowledgements

AS is funded by MICINN (Spain) (BFU2008-0196). RA is partially supported by MICINN (Spain) through Grants BFU2007-62772/BMC and BFU2010-17704. AS and RA are members of the 2009SGR809 research group of the Generalitat de Catalunya. GG-G and CP acknowledge support from the Spanish Ministry of Science and Innovation (Projects DPI2008-04099 and CTQ2009-14420-C02-01) and the Spanish Ministry of External Affairs and Cooperation (Projects A/023551/09 and A/031707/10).