Departament d’Enginyeria Química, Universitat Rovira i Virgili, Tarragona, Spain
Departamento de Matemática Aplicada y Estadística, Universidad Politécnica de Cartagena, Cartagena, Spain
Abstract
Background
The estimation of parameter values for mathematical models of biological systems is an optimization problem that is particularly challenging due to the nonlinearities involved. One major difficulty is the existence of multiple minima in which standard optimization methods may fall during the search. Deterministic global optimization methods overcome this limitation, ensuring convergence to the global optimum within a desired tolerance. Global optimization techniques are usually classified into stochastic and deterministic. The former typically lead to lower CPU times but offer no guarantee of convergence to the global minimum in a finite number of iterations. In contrast, deterministic methods provide solutions of a given quality (i.e., optimality gap), but tend to lead to large computational burdens.
Results
This work presents a deterministic outer approximation-based algorithm for the global optimization of dynamic problems arising in the parameter estimation of models of biological systems. Our approach, which offers a theoretical guarantee of convergence to the global minimum, is based on reformulating the set of ordinary differential equations into an equivalent set of algebraic equations through the use of orthogonal collocation methods, giving rise to a nonconvex nonlinear programming (NLP) problem. This nonconvex NLP is decomposed into two hierarchical levels: a master mixed-integer linear programming (MILP) problem that provides a rigorous lower bound on the optimal solution, and a reduced-space slave NLP that yields an upper bound. The algorithm iterates between these two levels until a termination criterion is satisfied.
Conclusion
The capabilities of our approach were tested in two benchmark problems, in which the performance of our algorithm was compared with that of the commercial global optimization package BARON. The proposed strategy produced near-optimal solutions (i.e., within a desired tolerance) in a fraction of the CPU time required by BARON.
Background
Elucidation of biological systems has gained wider interest in the last decade. Despite recent advances, fundamental understanding of life processes still requires powerful theoretical tools from mathematics and physical sciences. Particularly, mathematical modelling of biological systems is nowadays becoming an essential partner of experimental work. One of the most challenging tasks in computational modelling of biological systems is the estimation of the model parameters. The aim here is to obtain the set of parameter values that make the model response consistent with the data observed. Parameter estimation can be formulated as an optimization problem in which the sum of squared residuals between the measured and simulated data is minimized. The biological model dictates the type of optimization problem being faced. Many biological systems are described through nonlinear ordinary differential equations (ODEs) that provide the concentration profiles of certain metabolites over time. Recent methodological developments have enabled the generation of some dynamic profiles of gene networks and protein expression data, although the latter are still very rare. In this context, there is a strong motivation for developing systematic techniques for building dynamic biological models from experimental data. The parameter estimation of these models gives rise to dynamic optimization problems which are hard to solve.
Existing approaches to optimize dynamic models can be roughly classified as direct or indirect (also known as variational)
Models of biological systems are typically highly nonlinear, which gives rise to nonconvex optimization problems with multiple local solutions (i.e., multimodality). Because of this, traditional gradient-based methods used in the sequential and simultaneous approaches may become trapped in local optima. In the context of parameter estimation, these local solutions should be avoided, since they may lead to inaccurate models that are unable to predict the system’s performance precisely.
Global optimization (GO) algorithms are a special class of techniques that attempt to identify the global optimum in nonconvex problems. These methods can be classified as stochastic and deterministic. Stochastic GO methods are based on probabilistic algorithms that provide near-optimal solutions in short CPU times. Despite having shown great potential with large-scale problems like parameter estimation
A rigorous lower bound on the global optimum of the original nonconvex problem is obtained by solving a valid relaxation that contains its feasible space. To construct this relaxed problem, the nonconvex terms in the original formulation are replaced by convex envelopes that overestimate its feasible region. There are different types of convex envelopes that provide relaxations for a wide variety of nonconvexities. These relaxations are the main ingredient of deterministic GO methods and play a key role in their performance. In general, tighter relaxations provide better bounds (i.e., closer to the global optimum), thereby expediting the overall solution procedure.
To the best of our knowledge, Esposito and Floudas were the first to propose a deterministic method for the global solution of dynamic optimization problems with embedded ODEs
This work proposes a computational framework for the deterministic global optimization of parameter estimation problems of nonlinear dynamic biological systems. The main contributions of our work are: (1) the application of deterministic global optimization methods to dynamic models of biological systems, and (2) the use of several known techniques employed in dynamic optimization (i.e., orthogonal collocation on finite elements) and global optimization (i.e., symbolic reformulation of NLPs and piecewise McCormick envelopes) in the context of an outer approximation algorithm. The approach presented relies on discretizing the set of nonlinear ODEs using orthogonal collocation on finite elements, thereby transforming the dynamic system into an equivalent nonconvex NLP problem. A customized outer approximation algorithm that relies on a mixed-integer linear programming (MILP) relaxation is used in an iterative scheme along with the aforementioned NLP to solve the nonconvex model to global optimality. The MILP relaxation is tightened using a special type of cutting plane that exploits the problem structure, thereby expediting the overall solution procedure.
The capabilities of our algorithm are tested through its application to two case studies: the isomerisation of
Methods
Problem statement
The problem addressed in this work can be stated as follows: we are given a dynamic kinetic model describing the mechanism of a set of biochemical reactions. The goal is to determine the values of the model coefficients (e.g., rate constants, initial conditions, etc.) that minimize the sum of squares of the residuals between the simulated data provided by the model and the experimental observations.
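As a minimal sketch of this objective, assume a hypothetical one-parameter decay model dy/dt = -k*y (not one of the case studies in this work) integrated with a fixed-step Runge-Kutta scheme. The function names `simulate` and `sum_squared_residuals`, the synthetic data and the step size are illustrative assumptions only.

```python
import math

def simulate(k, y0, times, steps_per_unit=200):
    """Integrate dy/dt = -k*y with classical RK4; return y at each time in `times`."""
    def rhs(y):
        return -k * y
    profile, t, y = [], 0.0, y0
    for target in times:  # `times` must be sorted in increasing order
        n = max(1, round((target - t) * steps_per_unit))
        h = (target - t) / n
        for _ in range(n):
            k1 = rhs(y)
            k2 = rhs(y + 0.5 * h * k1)
            k3 = rhs(y + 0.5 * h * k2)
            k4 = rhs(y + h * k3)
            y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t = target
        profile.append(y)
    return profile

def sum_squared_residuals(k, y0, times, measured):
    """Objective of the estimation problem for a candidate rate constant k."""
    simulated = simulate(k, y0, times)
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))

# Synthetic "measurements" generated from the exact solution with k = 0.7.
times = [0.5, 1.0, 2.0, 4.0]
measured = [math.exp(-0.7 * t) for t in times]
print(sum_squared_residuals(0.7, 1.0, times, measured))  # close to zero
print(sum_squared_residuals(0.3, 1.0, times, measured))  # clearly larger
```

In a realistic model the objective is nonconvex in the parameters, which is what motivates the global optimization strategy described in the remainder of this section.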
Mathematical formulation
We consider dynamic parameter estimation optimization problems of the following form:
Where
Our solution strategy relies on reformulating the nonlinear dynamic optimization problem as a finite-dimensional NLP by applying a complete discretization using orthogonal collocation on finite elements. This NLP is next solved using an outer approximation algorithm (see Figure
Solution Strategy. The system of ODEs is first reformulated into a nonconvex NLP using the orthogonal collocation on finite elements approach. This NLP is decomposed into two levels: a master MILP and a slave NLP. The master MILP, which is constructed using piecewise McCormick envelopes and supporting hyperplanes, provides a rigorous lower bound on the global optimum. The slave NLP corresponds to the original nonconvex NLP that is solved using as starting point the solution of the MILP. The algorithm iterates between these two levels until the optimality gap (i.e., the relative difference between the upper and lower bounds) is reduced below a given tolerance.
Orthogonal collocation approach
There is a considerable number of collocation-based discretizations for the solution of differential-algebraic systems
The state variables are first approximated using Lagrange polynomials as follows:
These polynomials have the property that at the orthogonal collocation points their coefficients,
Because state variables may present steep variations, the whole solution space is commonly divided into time intervals called finite elements. Hence, the time variable
Orthogonal collocation discretization over finite elements. The time interval is divided into
Following the collocation method
The state variables have to be continuous between elements, so we enforce the following continuity constraints:
These equations extrapolate the polynomial at element
Moreover, initial conditions are enforced for the beginning of the first element using the following equation:
Recall that collocation points in which time has been discretized will not necessarily match the times at which experimental profiles were registered. Hence, variable
Where
Here, the subscript
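The collocation scheme described in this section can be illustrated with a small, self-contained sketch. It assumes a hypothetical scalar ODE dy/dt = -k*y (chosen because its collocation equations are linear and can be solved directly), three collocation points at the shifted roots of the Legendre polynomial of degree three, and five equal-length finite elements. All function names are illustrative, and the small linear solver is included only to keep the example free of external dependencies.

```python
import math

# Start of the element (tau = 0) followed by the three shifted Legendre roots on [0, 1].
TAU = [0.0, 0.5 - math.sqrt(15) / 10, 0.5, 0.5 + math.sqrt(15) / 10]

def lagrange(j, t, nodes=TAU):
    """Value at t of the Lagrange basis polynomial attached to nodes[j]."""
    value = 1.0
    for m, tm in enumerate(nodes):
        if m != j:
            value *= (t - tm) / (nodes[j] - tm)
    return value

def lagrange_deriv(j, t, nodes=TAU):
    """Derivative at t of the Lagrange basis polynomial attached to nodes[j]."""
    total = 0.0
    for m, tm in enumerate(nodes):
        if m == j:
            continue
        term = 1.0 / (nodes[j] - tm)
        for n, tn in enumerate(nodes):
            if n != j and n != m:
                term *= (t - tn) / (nodes[j] - tn)
        total += term
    return total

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def collocate_decay(k, y0, t_end, n_elements=5):
    """Discretize dy/dt = -k*y; return (t_start, length, coefficients) per element."""
    h = t_end / n_elements
    elements, y_start = [], y0
    for e in range(n_elements):
        # Collocation equations: sum_j y_j * L_j'(tau_i) = h * (-k * y_i), i = 1..3.
        a = [[lagrange_deriv(j, TAU[i]) + (h * k if i == j else 0.0)
              for j in range(1, 4)] for i in range(1, 4)]
        b = [-lagrange_deriv(0, TAU[i]) * y_start for i in range(1, 4)]
        coeffs = [y_start] + solve_linear(a, b)
        elements.append((e * h, h, coeffs))
        # Continuity: the polynomial evaluated at tau = 1 starts the next element.
        y_start = sum(c * lagrange(j, 1.0) for j, c in enumerate(coeffs))
    return elements

def evaluate(elements, t):
    """Interpolate the profile at an arbitrary time, e.g. a measurement time."""
    t0, h, coeffs = next((el for el in elements if t <= el[0] + el[1]), elements[-1])
    tau = (t - t0) / h
    return sum(c * lagrange(j, tau) for j, c in enumerate(coeffs))

profile = collocate_decay(k=1.0, y0=1.0, t_end=1.0)
print(evaluate(profile, 1.0), math.exp(-1.0))
```

The function `evaluate` mirrors the point made above: measurement times need not coincide with collocation points, so the state is recovered by evaluating the interpolating polynomial of the element that contains the sampling time. For nonlinear kinetics the collocation equations become nonlinear equality constraints of the NLP rather than a linear system.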
NLP formulation
The dynamic optimization problem is finally reformulated into the following NLP:
Results and discussion
Optimization approach
The method devised for globally optimizing the NLP that arises from the reformulation of the parameter estimation problem (Eqs. 13-17) is based on an outer approximation algorithm
Optimization algorithm based on outer approximation. Our approach decomposes the problem into two subproblems: a master MILP, constructed by relaxing the original model using piecewise McCormick envelopes and hyperplanes, that provides a lower bound, and a slave NLP that yields an upper bound. The algorithm iterates between these two levels until a termination criterion is satisfied.
Lower level master problem
Designing efficient strategies for attaining tight bounds is a major challenge in deterministic global optimization. Both the quality of the bounds and the time required to generate them strongly influence the overall performance of a deterministic global optimization algorithm.
Any feasible solution of the original NLP is a valid upper bound and can be obtained by means of a local NLP solver. To obtain lower bounds, we require a rigorous convex (linear or nonlinear) relaxation. This relaxation is obtained by replacing the nonconvex terms by convex estimators that overestimate the feasible region. Since the relaxed problem is convex, it is possible to solve it to global optimality using standard local optimizers. Furthermore, since its feasible region contains that of the original problem and its objective function rigorously underestimates the original one, it is guaranteed to provide a lower bound on the global optimum of the original nonconvex model
Androulakis et al.
To construct a valid MILP relaxation, we apply the following approach. We first reformulate the NLP using the symbolic reformulation method proposed by Smith and Pantelides
where vector
A rigorous relaxation of the original model is constructed by replacing the nonconvex terms in the reformulated model by convex estimators. The solution of the convex relaxation provides a valid lower bound on the global optimum. More precisely, the bilinear terms are replaced by piecewise McCormick relaxations. The fractional terms can be convexified in two different manners. The first is to replace them by tailored convex envelopes that exploit their structure
The reader is referred to the work by Smith and Pantelides
Piecewise McCormick-based relaxation
The bilinear terms appearing in the reformulated model are approximated using McCormick’s envelopes
Each bilinear term
The best known relaxation for approximating a bilinear term is given by the McCormick envelopes, obtained by replacing Eq. 26 by the following linear underestimators (Eqs. 27 and 28) and overestimators (Eqs. 29 and 30):
In this work we further tighten the McCormick envelopes by adding binary variables
Binary switch:
Continuous switch:
The binary switch
McCormick convex relaxation over the entire feasible region (subfigure (a)) compared to a piecewise McCormick relaxation over a smaller active region (subfigure (b)) where the tightness of the relaxation is improved. We built the master problem by replacing the bilinear terms by piecewise McCormick envelopes. The relaxation can be further improved by adding binary variables.
Eq. 31 enforces that only one binary variable is active:
The continuous switch Δ
Finally, the under- and overestimators for the active segment are defined in algebraic terms as follows:
Note that the discrete relaxation is tighter than the continuous one over the entire feasible region. The introduction of the binary variables required in the piecewise McCormick reformulation gives rise to a mixed-integer nonlinear programming (MINLP) problem, with the only nonlinearities appearing in the objective function. While this MINLP is convex and can be easily solved to global optimality with standard MINLP solvers, it is more convenient to linearize it in order to obtain an MILP formulation, for which more efficient software packages exist. The section that follows explains how this is accomplished.
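The effect of partitioning can be checked numerically with the standard McCormick inequalities for a bilinear term w = x*y. The sketch below is an illustration under stated assumptions: the symbols `x`, `y`, `xl`, `xu`, `yl`, `yu` are generic names rather than the notation of this work, only the domain of `x` is partitioned, and the active segment is selected directly from the value of `x` instead of through binary variables inside an MILP.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Lower and upper McCormick bounds on w = x*y at the point (x, y)."""
    under = max(xl * y + x * yl - xl * yl,
                xu * y + x * yu - xu * yu)
    over = min(xu * y + x * yl - xu * yl,
               xl * y + x * yu - xl * yu)
    return under, over

def piecewise_mccormick_bounds(x, y, xl, xu, yl, yu, n_segments):
    """Same bounds, using only the segment of [xl, xu] that contains x."""
    width = (xu - xl) / n_segments
    index = min(int((x - xl) / width), n_segments - 1)
    seg_l = xl + index * width
    return mccormick_bounds(x, y, seg_l, seg_l + width, yl, yu)

x, y = 1.5, 1.5  # true bilinear value: 2.25
print(mccormick_bounds(x, y, 0.0, 4.0, 0.0, 4.0))               # (0.0, 6.0)
print(piecewise_mccormick_bounds(x, y, 0.0, 4.0, 0.0, 4.0, 4))  # (1.5, 3.0)
```

With four segments the interval enclosing the true product shrinks from [0, 6] to [1.5, 3] at this point, at the cost of one binary variable per segment in the optimization model.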
Hyperplanes underestimation
The convex MINLP can be further reformulated into an MILP by replacing the objective function by a set of hyperplanes. For this, we define two new variables as
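The idea of underestimating a convex objective by hyperplanes can be sketched for a single squared residual. The example assumes that the convex term is e**2 and that each hyperplane is the tangent at a chosen point; the variable definitions actually used in this work are not reproduced here, and the names `tangent_cut` and `linear_underestimate` are illustrative.

```python
def tangent_cut(e0):
    """Supporting hyperplane of e**2 at e0, returned as (slope, intercept)."""
    return 2.0 * e0, -e0 * e0

def linear_underestimate(e, cut_points):
    """Largest value over all cuts: a piecewise-linear lower bound on e**2."""
    return max(slope * e + intercept
               for slope, intercept in map(tangent_cut, cut_points))

cuts = [-1.0, 0.0, 1.0]
for e in (-1.5, -0.4, 0.0, 0.7, 2.0):
    print(e, linear_underestimate(e, cuts), e * e)

# Adding a hyperplane at the current solution makes the bound exact there.
print(linear_underestimate(0.7, cuts + [0.7]))
```

Because e**2 minus its tangent at e0 equals (e - e0)**2, every cut lies on or below the curve, so the linearized objective never exceeds the original one and the lower bound remains valid as hyperplanes are added.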
Upper level slave problem
A valid upper bound on the global optimum is obtained by optimizing the original NLP locally. This NLP is initialized using the solution provided by the MILP as starting point. The solution of this NLP is used to tighten the MILP, so the lower and upper bounds tend to converge as iterations proceed.
Algorithm steps
The proposed algorithm comprises the following steps:
1. Set iteration count it = 0, UB =
2. Set it=it + 1. Solve the master problem MILP.
(a) If the MILP is infeasible, stop (since the NLP is also infeasible).
(b) Otherwise, update the current LB making
3. Solve the slave problem NLP.
(a) If the NLP is infeasible, add one more piecewise term and hyperplane to the master MILP and go to step 2 of the algorithm.
(b) Otherwise, update the current UB making UB =
4. Calculate the optimality gap OG as
(a) If OG≤tol, then stop. The current UB is regarded as the global optimum within the desired tolerance.
(b) Otherwise, add one more piecewise section and hyperplane to the master MILP and go to step 2 of the algorithm.
Remarks:
There are different methods to update the piecewise bilinear approximation. One possible strategy is to divide the active piecewise term (i.e., the one in which the solution is located) into two equal-length segments.
The new hyperplane term
The univariate convex and concave terms in the reformulated problem can be approximated either by the secant or by a piecewise univariate function, as is done with the McCormick envelopes.
Our algorithm needs to be tuned prior to its application. This is a common practice in any optimization algorithm. In a previous publication
The approach presented might lead to large computational burdens in large-scale models of complex biological systems. Future work will focus on expediting our algorithm through the addition of cutting planes and the use of customized decomposition strategies.
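The bounding loop in the steps above can be sketched generically. The code below is a toy illustration, not the implementation used in this work: the master problem is replaced by a simple Lipschitz lower bound on a one-dimensional test function instead of an MILP relaxation, the slave problem is a basic pattern search instead of an NLP solver, and the gap is assumed to be (UB - LB)/|UB|, which is consistent with the gaps reported in the results tables. All names (`outer_approximation`, `solve_master`, `solve_slave`, `refine`) are hypothetical.

```python
import math

def outer_approximation(solve_master, solve_slave, refine, tol=0.05, max_iter=500):
    """Iterate master (lower bound) and slave (upper bound) until the gap closes."""
    ub, lb, best = math.inf, -math.inf, None
    for it in range(1, max_iter + 1):
        master = solve_master()               # step 2: relaxed problem
        if master is None:                    # step 2(a): relaxation infeasible
            return None
        value, start = master
        lb = max(lb, value)                   # step 2(b): update lower bound
        slave = solve_slave(start)            # step 3: local solve from master point
        if slave is not None and slave[0] < ub:
            ub, best = slave                  # step 3(b): keep best upper bound
        if ub < math.inf and (ub - lb) / max(abs(ub), 1e-12) <= tol:
            return {"x": best, "UB": ub, "LB": lb, "iterations": it}
        refine(start)                         # steps 3(a) and 4(b): tighten master
    return {"x": best, "UB": ub, "LB": lb, "iterations": max_iter}

# Toy nonconvex problem: minimize f on [0, 4]; |f'(x)| <= 3.5 everywhere.
def f(x):
    return 2.0 + math.sin(3.0 * x) + 0.5 * x

LIPSCHITZ = 3.5
segments = [(0.0, 4.0)]

def segment_bound(seg):
    """Rigorous lower bound on f over the segment, from the Lipschitz constant."""
    a, b = seg
    return 0.5 * (f(a) + f(b)) - 0.5 * LIPSCHITZ * (b - a)

def solve_master():
    seg = min(segments, key=segment_bound)
    return segment_bound(seg), 0.5 * (seg[0] + seg[1])

def solve_slave(start, step=0.1):
    """Derivative-free local descent started at the master solution."""
    x = start
    while step > 1e-9:
        cand = min((max(0.0, x - step), min(4.0, x + step)), key=f)
        if f(cand) < f(x):
            x = cand
        else:
            step *= 0.5
    return f(x), x

def refine(start):
    """Bisect the active segment, i.e. the one containing the master solution."""
    seg = next(s for s in segments if s[0] <= start <= s[1])
    segments.remove(seg)
    mid = 0.5 * (seg[0] + seg[1])
    segments.extend([(seg[0], mid), (mid, seg[1])])

result = outer_approximation(solve_master, solve_slave, refine, tol=0.05)
print(result)
```

The structure, rather than the toy bounding rule, is the point: the master solution both supplies the lower bound and initializes the local search, and each refinement tightens the relaxation so that the two bounds approach each other.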
Case studies
We illustrate the performance of the proposed algorithm through its application to two challenging benchmark parameter estimation problems: the isomerisation of
| | Isomerisation (case study 1) | Inhibition of HIV proteinase (case study 2) |
|---|---|---|
| MILP equations | 1,836 | 138,128 |
| MILP continuous variables | 1,096 | 53,321 |
| MILP binary variables | 380 | 3,625 |
| NLP equations | 186 | 16,306 |
| NLP variables | 196 | 16,361 |
Case study 1: Isomerisation of
In this first case study, five kinetic parameters describing the thermal isomerisation of
Proposed mechanism describing the thermal isomerization of
Hunter and McGregor
Rodriguez-Fernandez et al.
Following our approach, the state variables were approximated by Lagrange polynomials using three collocation points evaluated at the shifted roots of orthogonal Legendre polynomials and defining five finite elements of equal length. The nonconvexities in the resulting residual equations are given by the bilinear terms
It is well known that the quality of the lower bound predicted by a relaxation strongly depends on the bounds imposed on its variables
which forces the model to find a solution better than the one obtained at the beginning of the search by locally minimizing the original NLP (i.e., 20 is a rigorous upper bound for the objective function). Furthermore, the parameter
The problem was solved with 6 initial hyperplanes. An extra hyperplane was added in each iteration, but the total number of piecewise terms was kept constant (4 piecewise intervals were considered) in order to keep the MILP in a manageable size. A tolerance of 5% was set as termination criterion.
For comparison purposes, we solved the same problem with the standard global optimization package BARON using its default settings. BARON was able to find the global optimum but failed to reduce the optimality gap below the specified tolerance after 12 h of CPU time. In contrast, our algorithm closed the gap in less than 3 h (see Table
| | Rodriguez-Fernandez et al. | BARON | Proposed algorithm |
|---|---|---|---|
| Sum of squares | 19.87 | 19.87 | 19.87 |
| UB | - | 19.87 | 19.87 |
| LB | - | 4.112 | 19.26 |
| Gap (%) | - | 79.31 | 3.056 |
| Iterations | 9,518 | 60,614 | 2 |
| Time (CPU s) | 122 | 43,200 | 8,916 |
Case study 2: Inhibition of HIV proteinase
In this second case study, we considered a much more complex biological dynamic system. Particularly, we studied the reaction mechanism of the irreversible inhibition of HIV proteinase, as originally examined by Kuzmic
Proposed mechanism describing the irreversible inhibition of HIV proteinase. The enzyme HIV proteinase (E), which is only active in a dimer form, was added to a solution of an irreversible inhibitor (I) and a fluorogenic substrate (S). The product (P) is a competitive inhibitor for the substrate.
The model can be described mathematically through a set of nine nonlinear ODEs with ten parameters:
where the following initial conditions and parameters are known:
A series of five experiments where the enzyme HIV proteinase (E) (assay concentration 0.004
The fluorescence changes were monitored during one hour. The measured signal is a linear function of the product (P) concentration, as expressed in the following equation:
In this fit, the offset (baseline) of the fluorimeter was considered as a degree of freedom. A certain degree of uncertainty (±50%) was assumed for the value of the initial concentrations of substrate and enzyme (titration errors).
The calibration of a total of 20 adjustable parameters was addressed: five rate constants, five initial substrate concentrations, five initial enzyme concentrations and five offset values. Mendes and Kell
In our study, the state variables were approximated using five orthogonal collocation points and five equal-length finite elements. In this case, the nonconvexities arise from the bilinear terms
The parameter bounds
The master problem was further tightened by adding a special type of strengthening cut. These cuts are generated by temporally decomposing the original full-space MILP into a series of smaller MILPs, in each of which we fit only a subset of the original dataset and remove the continuity equations corresponding to the extreme elements included in the subproblem. The solution of each sub-MILP gives a lower bound on the error of the subset of elements it covers, and this bound is added to the master problem as an inequality.
This case study was solved with 3 initial piecewise intervals and 6 initial hyperplanes. Two strengthening cuts involving elements 1, 2, 3 and 4, and 2, 3, 4 and 5, respectively, were added as constraints. A tolerance of 20% was used in the calculations. Hyperplanes and piecewise terms were updated at each iteration of the algorithm. In this case, BARON failed to identify any feasible solution after 12 h of CPU time.
In contrast, our algorithm was able to obtain the global optimum (Table
| Parameter | Rodriguez-Fernandez et al. | Proposed algorithm |
|---|---|---|
| Sum of squares | 0.01997 | 0.01961 |
| k_{3} (s^{−1}) | 6.235 | 5.764 |
| k_{42} (s^{−1}) | 8,772 | 968.7 |
| k_{22} (s^{−1}) | 473 | 129.9 |
| k_{52} (s^{−1}) | 0.09726 | 0.01612 |
| k_{6} (s^{−1}) | 0.01417 | 0.01337 |
| S_{0} exp. 1 | 24.63 | 24.61 |
| S_{0} exp. 2 | 23.32 | 23.4 |
| S_{0} exp. 3 | 26.93 | 27.05 |
| S_{0} exp. 4 | 13.34 | 13.97 |
| S_{0} exp. 5 | 12.5 | 12.5 |
| E_{0} exp. 1 | 0.005516 | 0.005286 |
| E_{0} exp. 2 | 0.005321 | 0.005168 |
| E_{0} exp. 3 | 0.006 | 0.006 |
| E_{0} exp. 4 | 0.004391 | 0.004428 |
| E_{0} exp. 5 | 0.003981 | 0.004105 |
| offset exp. 1 | 0.004339 | 0.004234 |
| offset exp. 2 | 0.001577 | 0.003478 |
| offset exp. 3 | 0.01117 | 0.0142 |
| offset exp. 4 | 0.001661 | 0.005177 |
| offset exp. 5 | 0.007133 | 0.00486 |
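| | Rodriguez-Fernandez et al. | BARON | Proposed algorithm |
|---|---|---|---|
| Sum of squares | 0.01997 | failed | 0.01961 |
| UB | - | - | 0.01961 |
| LB | - | - | 0.01595 |
| Gap (%) | - | - | 18.64 |
| Iterations | 29,345 | 263 | 3 |
| Time (CPU s) | 1,294 | 43,200 | 4,351 |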
Conclusions
In this work, we have proposed a novel strategy for globally optimizing parameter estimation problems with embedded nonlinear dynamic systems. The method presented was tested through two challenging benchmark problems: the isomerisation of
The proposed algorithm identified the best known solution, which was originally reported by Rodriguez-Fernandez et al.
The proposed method produced promising results, surpassing the capabilities of BARON. Our method requires some knowledge of optimization theory as well as skill in using modelling systems. Our final goal is to develop software that automates the calculations, so that our approach can be easily used by a wider community. This is a challenging task, since nonlinear models are hard to handle and typically require customized solution procedures. In particular, nonlinear models must be initialized carefully to ensure convergence even to a local solution. In this regard, the use of an outer approximation scheme that relies on a master MILP formulation is quite appealing, since the outcome of this MILP can be used to initialize the NLP in a robust manner.
Another key point here is how to construct tight relaxations of the nonconvex terms. An efficient algorithm must exploit the problem structure to obtain high-quality relaxations and therefore good bounds close to the global optimum. These relaxations can be further tightened through the addition of cutting planes or the use of customized decomposition methods. There is still much work to be done in this area, but we strongly believe that such an effort is worthwhile. Furthermore, recent advances in global optimization theory and software applications are paving the way to develop systematic deterministic tools for the global optimization of parameter estimation problems of increasing size. Our future work will focus on making the approach more efficient through the use of tailored cutting planes and decomposition strategies and also through the hybridization of deterministic methods with stochastic approaches.
Competing interest
The authors declare that they have no competing interests.
Authors’ contributions
G. GG suggested the need for the approach and J.A. E provided the biological problems. A. M, C. P, G. GG and L. J developed the optimization algorithms and performed the numerical analysis. All authors evaluated the results, wrote the paper and contributed to its final form.
Acknowledgements
This work was supported by the Spanish Ministry of Science and Innovation through the doctoral research grant FPIMICINN reference grant BES2010037166 and through the project CTQ200914420C0201.