Institute of Physics, University of Freiburg, Germany

Freiburg Centre for Systems Biology, Germany

Process Engineering Group, Spanish Council for Scientific Research, IIM-CSIC, Spain

Abstract

Background

Modeling and simulation of cellular signaling and metabolic pathways as networks of biochemical reactions yields sets of non-linear ordinary differential equations. These models usually depend on several parameters and initial conditions. If these parameters are unknown, results from simulation studies can be misleading. Such a scenario can be avoided by fitting the model to experimental data before analyzing the system. This involves parameter estimation, which is usually performed by minimizing a cost function which quantifies the difference between model predictions and measurements. Mathematically, this is formulated as a non-linear optimization problem which often turns out to be multi-modal (non-convex), rendering local optimization methods unreliable.

Results

In this work we propose a new hybrid global method, based on the combination of an evolutionary search strategy with a local multiple-shooting approach, which offers a reliable and efficient alternative for the solution of large-scale parameter estimation problems.

Conclusion

The presented new hybrid strategy offers two main advantages over previous approaches: First, it is equipped with a switching strategy which allows the systematic determination of the transition from the global to the local search. This avoids computationally expensive tests in advance. Second, using multiple-shooting as the local search procedure reduces the multi-modality of the non-linear optimization problem significantly. Because multiple-shooting avoids possible spurious solutions in the vicinity of the global optimum, it often outperforms the frequently used initial value approach (single-shooting). The use of multiple-shooting thereby yields an enhanced robustness of the hybrid approach.

Background

The goal of systems biology is to shed light on the functionality of living cells and how they can be influenced to achieve a certain behavior. Systems biology therefore aims to provide a holistic view of the interaction and the dynamical relation between various intracellular biochemical pathways. Often, such pathways are qualitatively known, which serves as a starting point for deriving a mathematical model. In these models, however, most of the parameters are generally unknown, which hampers the possibility of performing quantitative predictions. Modern experimental techniques can be used to obtain time-series data of the biological system under consideration, from which unknown parameter values can be estimated. Since these data are often sparsely sampled, parameter estimation remains an important challenge in these systems. Reliable parameter estimates are, in turn, a prerequisite for model-based analysis and prediction.

Parameter estimation is usually performed by minimizing a cost function which quantifies the differences between model predictions and measured data. In general, this is mathematically formulated as a non-linear optimization problem which often turns out to be multi-modal (non-convex). Most of the currently available optimization algorithms, especially local deterministic methods, may lead to suboptimal solutions if multiple local optima are present.

In this work a refined hybrid strategy is proposed which offers two main advantages over previous alternatives.

Parameter estimation in dynamical systems

Generally, the parameter estimation problem can be stated as follows. Suppose that the dynamical system is described by a state x(t) ∈ ℝ^d at time t ∈ [t_0, t_f], which is the unique and differentiable solution of the initial value problem

ẋ(t) = f(x(t), θ),  x(t_0) = x_0.

The right-hand side of the ODE depends, in addition, on some parameters θ. Let y_{ij} denote the datum of observable j measured at sampling time t_i. Each y_{ij} satisfies the observation equation

y_{ij} = g_j(x(t_i), θ) + σ_{ij} ε_{ij}

for some observation function g : ℝ^d → ℝ^N and noise levels σ_{ij} > 0, where the ε_{ij} are independent and standard Gaussian distributed random variables. The sample points t_i are ordered such that t_0 ≤ t_1 < ... < t_n ≤ t_f. If data from N_exp different experiments are available, the observation equation generalizes to

y_{ijk} = g_j(x_k(t_i), θ) + σ_{ijk} ε_{ijk},  k = 1, ..., N_exp.

Certain parameters may be different for each experiment, but the treatment of these local parameters and of the different experiments requires only obvious modifications of the described procedures; therefore only the single-experiment design N_exp = 1 is discussed in the following for the sake of clarity.

On the basis of the measurements (y_{ij})_{i = 1,...,n} the task is now to estimate the initial state x_0 and the parameters θ. Denoting by x(t_i; x_0, θ) the solution of the initial value problem evaluated at t_i, the cost function is then given by the weighted least-squares functional

ℒ(x_0, θ) = Σ_{i=1}^{n} Σ_{j=1}^{N} (y_{ij} − g_j(x(t_i; x_0, θ), θ))² / σ_{ij}².

In general, minimizing ℒ is a formidable task, which requires advanced numerical techniques.
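To make the estimation problem concrete, the following sketch evaluates such a weighted least-squares cost for a small illustrative two-state ODE model with identity observation function. The model, rate names k1 and k2, and noise level are invented for illustration; they are not one of the paper's case studies.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative two-state linear model; k1, k2 are hypothetical rate constants.
def rhs(t, x, k1, k2):
    return [-k1 * x[0], k1 * x[0] - k2 * x[1]]

def cost(params, t_obs, y_obs, sigma):
    """Weighted least-squares cost L(x0, theta) with params = (x0, theta)."""
    x0, theta = params[:2], params[2:]
    sol = solve_ivp(rhs, (t_obs[0], t_obs[-1]), x0, args=tuple(theta),
                    t_eval=t_obs, rtol=1e-8, atol=1e-10)
    # residuals (y_ij - g_j(x(t_i; x0, theta))) / sigma_ij, with g = identity
    return np.sum(((y_obs - sol.y.T) / sigma) ** 2)

# Generate noise-free "data" from the true values; the cost vanishes there.
t_obs = np.linspace(0.0, 5.0, 11)
true = np.array([1.0, 0.0, 0.8, 0.3])        # x0 = (1, 0), theta = (0.8, 0.3)
ref = solve_ivp(rhs, (0.0, 5.0), true[:2], args=(0.8, 0.3),
                t_eval=t_obs, rtol=1e-8, atol=1e-10)
y_obs = ref.y.T
print(cost(true, t_obs, y_obs, sigma=0.1))   # ≈ 0 at the true parameters
```

Every evaluation of ℒ requires a full numerical integration of the ODE system, which is why efficient optimization strategies matter.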

Methods

Mathematical modeling in systems biology relies on quantitative information about biological components and their reaction kinetics. Due to the paucity of quantitative data, various numerical optimization techniques have been employed to estimate the parameters of such biological systems. Employed optimization techniques include local, deterministic approaches such as the Levenberg-Marquardt algorithm and Sequential Quadratic Programming, and stochastic approaches such as Simulated Annealing, Genetic Algorithms and Evolutionary Algorithms. All of these aim at estimating the initial state x_0 and the parameters θ from the available data.

One of the simplest global methods is a multistart method. Here, a large number of initial guesses is drawn from a distribution and subjected to a parameter estimation algorithm based on a local optimization approach. The smallest minimum found is then regarded as the global optimum. In practice, however, there is no guarantee of arriving at the global solution, and the computational effort can be quite large. These difficulties arise because it is not clear a priori how many random initial guesses are necessary. Over the last decade more suitable techniques for the solution of multi-modal optimization problems have been developed.
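As an illustration, a minimal multistart loop might look as follows; the two-parameter multimodal cost is a hypothetical stand-in for the least-squares objective, not a model from this paper.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Hypothetical multimodal stand-in for the least-squares cost; it has two
# local minima near theta_0 = +1 and theta_0 = -1, the latter being global.
def cost(theta):
    return (theta[0] ** 2 - 1.0) ** 2 + theta[1] ** 2 + 0.1 * theta[0]

def multistart(fun, box, n_starts=50):
    """Draw uniform initial guesses from `box` and keep the best local fit."""
    lo, hi = box
    best = None
    for _ in range(n_starts):
        guess = rng.uniform(lo, hi, size=2)
        res = minimize(fun, guess, method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    return best

best = multistart(cost, box=(-5.0, 5.0))
print(best.x, best.fun)   # best.x[0] ends up near -1 (the global basin)
```

The number of restarts needed to hit the global basin grows with the box size, which is exactly the scaling problem described above.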

Hybrid strategies that combine a global search with a local optimization routine have been proposed previously; the method presented here builds on and refines this idea.

Multiple-shooting

A detailed discussion of the method, including applications to measured data, can be found in the literature. The basic idea of multiple-shooting is to subdivide the time interval [t_0, t_f] into N_ms subintervals [τ_k, τ_{k+1}], k = 1, ..., N_ms, such that each interval contains at least one measurement. Each of the intervals is treated as an individual experiment having its own initial values x_k(τ_k), so that the trajectory on each interval can be computed independently. Since the total trajectory has to be continuous, the interval-wise solutions are joined by continuity constraints, and the full time course is recovered as the union of the trajectories over [τ_1, τ_2] ∪ ... ∪ [τ_{N_ms}, t_f].

For each interval k = 1, ..., N_ms let p_k = (x_k(τ_k), θ) collect the interval-wise initial values together with the parameters. Minimizing the cost function with respect to all p_k subject to the continuity constraints yields a constrained non-linear programming problem, Eq. (5),

where the continuity constraints are given in the first row of the constraint part, followed by further optional constraints, e.g. bounds on the parameters.
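A minimal numerical sketch of these ideas: a single-state ODE ẋ = −θx with two shooting intervals, each carrying its own initial value, joined by an equality constraint. The model and values are illustrative, and the paper's generalized-quasi-Newton solver is replaced here by SciPy's SLSQP for brevity.

```python
import numpy as np
from scipy.optimize import minimize

# One-state example ODE dx/dt = -theta*x with analytic flow, so each
# shooting interval can be integrated in closed form.
def flow(x0, theta, dt):
    return x0 * np.exp(-theta * dt)

t_obs = np.linspace(0.0, 4.0, 9)
y_obs = 2.0 * np.exp(-0.5 * t_obs)          # noise-free data, theta = 0.5

tau = 2.0                                    # boundary of the two intervals
mask1, mask2 = t_obs <= tau, t_obs > tau

def objective(p):
    s1, s2, theta = p                        # interval-wise initial values + theta
    r1 = y_obs[mask1] - flow(s1, theta, t_obs[mask1])
    r2 = y_obs[mask2] - flow(s2, theta, t_obs[mask2] - tau)
    return np.sum(r1 ** 2) + np.sum(r2 ** 2)

# Continuity constraint: trajectory of interval 1 must match s2 at tau.
cons = {"type": "eq", "fun": lambda p: flow(p[0], p[2], tau) - p[1]}

res = minimize(objective, x0=[1.0, 1.0, 1.0], constraints=[cons], method="SLSQP")
print(res.x)   # estimated (s1, s2, theta)
```

During the iterations the two pieces of the trajectory may be discontinuous at τ; the constraint only has to hold at convergence, which is the mechanism that reduces multi-modality.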

We solved the non-linear programming problem defined by Eq. (5) iteratively by employing a generalized-quasi-Newton method.

where d_θ denotes the derivative with respect to the parameters θ. Setting θ^l = θ^{l-1} + Δθ^l and repeating Eq. (7) until Δθ^l ≈ 0 yields the desired parameter estimates, under the condition that all parameters are identifiable and the constraints are not contradictory. These extra assumptions are necessary to fulfil the so-called Kuhn-Tucker conditions for the solvability of constrained, non-linear optimization problems.

In combination with multiple-shooting the generalized-quasi-Newton approach has three major advantages: First, the optimization is sub-quadratically convergent. Second, a transformation of Eqs. (7) can be found such that the transformed equations are numerically equivalent to the initial value approach. Third, due to the linearization of the continuity constraints, these do not have to be fulfilled exactly after each iteration, but only at convergence. This allows discontinuous trajectories during the optimization process, reducing the problem of local minima. The first two properties yield the desired speed of convergence, whereas the third property is mainly responsible for the stability of multiple-shooting: the algorithm can circumvent local minima by allowing for discontinuous trajectories while searching for the global minimum. The main disadvantage, in turn, is due to the linearization of the cost function. It can easily happen that, although the update step Δθ^l points in the direction of decreasing ℒ, the proposed step is too large. Such overshooting is common to any simple optimization procedure based on a local approximation of the cost function. A suitable approach to cure this deficiency is to relax the update step, i.e. θ^l = θ^{l-1} + λ^l Δθ^l for some λ^l ∈ (0, 1]. This procedure is referred to as damping and provides the basis for the determination of the switching point which we propose in the following.
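The damped update θ^l = θ^{l-1} + λ^l Δθ^l can be illustrated with a scalar toy problem; the exponential-decay model and the starting value are invented for illustration. The full Gauss-Newton step is halved until it actually decreases the cost:

```python
import numpy as np

# Toy data: exponential decay with true rate theta = 0.7.
t = np.linspace(0.0, 4.0, 20)
y = np.exp(-0.7 * t)

def residual(theta):
    return y - np.exp(-theta * t)

def cost(theta):
    return np.sum(residual(theta) ** 2)

theta = 3.0                                # deliberately poor starting point
for _ in range(50):
    J = t * np.exp(-theta * t)             # d residual / d theta
    step = -np.sum(J * residual(theta)) / np.sum(J * J)   # Gauss-Newton step
    lam = 1.0                              # try the full step first ...
    while cost(theta + lam * step) > cost(theta) and lam > 1e-8:
        lam *= 0.5                         # ... and damp it on overshoot
    theta += lam * step
print(theta)                               # converges to ~0.7
```

Far from the minimum the full step overshoots and is rejected (λ < 1); close to the minimum the quadratic approximation is accurate and full steps are accepted, which is the behavior the switching criterion below exploits.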

A new hybrid method

Besides the choice of the global and local optimization procedures, the determination of the switching point is vital for the robustness of the hybrid approach, as discussed above.

Calculation of the switching point

The multiple-shooting method is equipped with a relaxation algorithm to prevent overshooting of the update step. This overshooting is due to the quadratic approximation of the likelihood function in Eq. (7), which is often too crude for points far away from the minimum. For these points the calculated update step tends to be too long and might lead to an increased value of the cost function. The relaxation method, also called damping, selects some λ^l ∈ (0, 1] such that the update step θ^l = θ^{l-1} + λ^l Δθ^l is a descent step. For this, a level function has to be used which shares the monotonicity properties of the cost function close to the global minimum. Here, the criterion to judge whether the proposed step at θ^{l-1} is a descent step is given by the following level function

T(λ) = ||J(θ^{l-1})^a F(θ^{l-1} + λΔθ^l)||²,

where J^a denotes a generalized inverse of the Jacobian J of the residual vector F, so that the full update step reads Δθ^l = −J(θ^{l-1})^a F(θ^{l-1}). Based on this level function, the switch from the global to the local search is performed as soon as a number n_1 of consecutive full steps (λ = 1) is achieved. After the initialization of the method, a number of iterations n_0 is performed using the global method without checking the switching-point criterion in order to decrease the computational load; a minimum of around 15 iterations is usually needed, and this number may be increased if the size of the search space increases. For the simulations presented in this study n_1 = 2. Since the corrector-predictor scheme can be implemented very efficiently, the calculation of the damping parameter λ is computationally inexpensive.
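A compact sketch of the criterion, using a hypothetical Michaelis-Menten curve fit and a damped Gauss-Newton iteration in place of the full generalized-quasi-Newton scheme: the damping factors λ^l are recorded, and the switching point is declared at the first iterate with n_1 = 2 consecutive full steps.

```python
import numpy as np

# Toy Michaelis-Menten curve fit (not one of the paper's case studies);
# true parameters theta = (2.0, 0.8).
t = np.linspace(0.5, 10.0, 15)
y = 2.0 * t / (0.8 + t)

def residual(th):
    return y - th[0] * t / (th[1] + t)

def jacobian(th):
    # d residual / d theta; columns correspond to th[0] and th[1]
    return np.column_stack([-t / (th[1] + t),
                            th[0] * t / (th[1] + t) ** 2])

def cost(th):
    return np.sum(residual(th) ** 2)

th = np.array([6.0, 5.0])                  # poor starting point
lambdas = []                               # record the damping factors
for _ in range(40):
    step, *_ = np.linalg.lstsq(jacobian(th), -residual(th), rcond=None)
    lam = 1.0
    # damp until the step is a descent step (guards against nan/inf too)
    while not (cost(th + lam * step) < cost(th)) and lam > 1e-8:
        lam *= 0.5
    th = th + lam * step
    lambdas.append(lam)

# Switching point: first iterate with n1 = 2 consecutive full steps.
switch = next((i for i in range(1, len(lambdas))
               if lambdas[i - 1] == lambdas[i] == 1.0), None)
print(th, switch)
```

In the hybrid, this check runs alongside the global search: once full steps are consistently accepted, the quadratic model is trusted and the method commits to the local search.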

Results and Discussion

In order to demonstrate the performance of the method we have chosen two examples: the STAT5 signaling pathway and Goodwin's oscillatory feedback control system.

STAT5 signaling pathway

The JAK/STAT (Janus kinase/Signal Transducer and Activator of Transcription) signaling cascade is a well-studied pathway stimulating cell proliferation, differentiation, cell migration and apoptosis. The STAT5 dynamics are described by a small system of delay differential equations,

where k_1 and k_2 are rate constants and τ denotes a time delay. The variable x_1 describes unphosphorylated STAT5, whereas x_2 denotes the phosphorylated STAT5. Moreover, x_3 describes the dimer and x_4 is the nuclear STAT5. The receptor activity is denoted by E_A(t).

The measured quantities are the total amount of phosphorylated STAT5 in the cytoplasm, y_1 = c_1(x_2 + x_3), and the total amount of STAT5 in the cytoplasm, y_2 = c_2(x_1 + x_2 + x_3), where c_1 and c_2 are scaling parameters introduced to deal with the fact that only relative protein amounts are measured. Initial conditions and kinetic parameters were chosen to be: x_1(0) = 3.71, x_i(0) = 0 for i = 2, 3, 4, k_1 = 2.12, k_2 = 0.109, c_1 = 0.33 and c_2 = 0.26. From the simulated data we aim to estimate the rate constants k_1 and k_2, the delay parameter τ and the initial value x_1(0). In the case of the local optimization methods – single and multiple-shooting – we used multistarts, where the initial guess of each restart is randomly chosen from the intervals [0, 5] (Box 5), [0, 10] (Box 10), and [0, 100] (Box 100), respectively, using a uniform distribution. For each box size 100 restarts were performed. Note that the delay parameter τ has to be restricted to the length of the observation interval, t_f − t_0.

The results are given in Figure a, showing the percentage of convergence to the global minimum, to local minima, or failure for Box 5, Box 10, and Box 100, respectively. In the rather artificial case of zero noise shown in Figure a, multiple-shooting performs reasonably well, while already a significant fraction of the single shooting trials converges to a local minimum. Figure b presents the results obtained using data with a 10% noise-to-signal ratio. Adding noise deteriorates the performance of both approaches, which can be seen by comparing Figure a and Figure b. As anticipated, multiple-shooting outperforms single shooting, since it reduces the multimodality of the problem. However, multiple-shooting tends to fail more often than single shooting for large box sizes. Even for this rather simple example the chance of getting trapped in a local solution or of failing is quite significant and increases with increasing noise-to-signal ratio. The corresponding total computational costs for both methods are summarized in the following table.

Computational costs in the STAT5 case study (in seconds) for 0% and 10% noise to signal ratio, respectively.

Simulated data with 0%/10% noise

| Box Size | SS      | MS       | SRES  | Hybrid |
|----------|---------|----------|-------|--------|
| 5        | 65/80   | 140/155  | 30/46 | 9/10   |
| 10       | 86/90   | 317/453  | 34/55 | 10/11  |
| 100      | 141/170 | 950/1095 | 58/80 | 17/22  |

The CPU time is normalized using the Linpack benchmark table and is, in the case of the multistarts of the single shooting (SS) and multiple-shooting (MS) methods, the sum over all restarts. The increased robustness of MS results in substantially higher computational cost compared to SS. The hybrid is about 3–4 times faster than SRES, manifesting the advantage of the proposed method.

In contrast to the local methods, both the global search strategy SRES and the hybrid approach converged in all cases to the global optimum, which emphasises the strength of global methods. Note that results obtained by DE are comparable to SRES and are therefore omitted. The power of the hybrid strategy can be appreciated by considering the average computational cost shown in the table above.

Oscillatory feedback control system: Goodwin's model

Parameter estimation for oscillating systems is usually more involved than for systems showing transient behavior. A well-known model describing oscillations in enzyme kinetics is the model suggested by Goodwin.

Here, the three state variables represent the concentrations of mRNA, the corresponding protein and the end product, respectively, and the end product inhibits mRNA synthesis via a Hill-type term.
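A minimal simulation sketch of the classical three-variable Goodwin loop; the Hill exponent, rate constants and initial conditions are illustrative choices, not the values of this case study:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Classical Goodwin negative-feedback loop: the end product x3 represses
# the synthesis of x1. Steep repression (large Hill exponent n) combined
# with slow degradation favours sustained oscillations.
def goodwin(t, x, n=12, k=0.1):
    x1, x2, x3 = x
    return [1.0 / (1.0 + x3 ** n) - k * x1,   # repressible synthesis of x1
            x1 - k * x2,                       # production of x2 from x1
            x2 - k * x3]                       # production of x3 from x2

sol = solve_ivp(goodwin, (0.0, 300.0), [0.1, 0.1, 0.1],
                max_step=0.5, rtol=1e-8, atol=1e-10)
print(sol.y[:, -1])
```

For parameter estimation, the rate constants (and possibly the Hill exponent) would form the parameter vector θ, and the least-squares cost of the Background section would be evaluated on top of such a simulation; the oscillatory solutions are what make the resulting cost landscape strongly multi-modal.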

As in the previous case, the problem is first approached using multistarts where either single shooting or multiple-shooting is employed. The initial guess of each restart is randomly chosen from the intervals [0, 5] (Box 5), [0, 10] (Box 10) and [0, 100] (Box 100), respectively, for both the parameters and the initial conditions, using a uniform distribution and two noise-to-signal ratios, 0% and 10%. The results are summarized in the corresponding figure, showing the percentage of convergence to the global minimum, to local minima, or failure for the different box sizes. Both local methods encounter difficulties in finding the global optimum: single shooting quickly gets trapped in local minima or diverges and only in a reduced percentage of the runs converges to the global solution, whereas multiple-shooting performs better than single shooting in all cases, at the expense of higher computational costs. In the case of the global approaches only DE, under the choice of robust, and thus slower, strategy parameters, was able to find the global minimum, whereas no convergent fit was obtained using SRES. This emphasizes the difficulties in finding the optimal solution for oscillatory systems, even for global search strategies. The convergence figure (**a**: 0% noise-to-signal ratio, **b**: 10% noise-to-signal ratio) shows representative convergence curves of DE and the hybrid towards the global optimum of the Goodwin problem given by Eq. (9). The benefit of the hybrid can be appreciated by comparing the left panel (DE) with the right panel (hybrid). For box size 10 the hybrid converges almost ten times faster, while for larger box sizes the advantage is even more pronounced. This is also reflected by the CPU times presented in the following table.

Computational costs in the Goodwin case study (in seconds) for 0% and 10% noise to signal ratio, respectively.

Simulated data with 0%/10% noise

| Box Size | SS      | MS        | DE        | Hybrid |
|----------|---------|-----------|-----------|--------|
| 5        | 213/409 | 907/1153  | 108/104   | 13/12  |
| 10       | 326/423 | 1340/1443 | 972/846   | 16/14  |
| 100      | 453/472 | 733/1021  | 1320/1370 | 30/26  |

The CPU time is normalized using the Linpack benchmark table and is, in the case of the multistarts of the single shooting (SS) and multiple-shooting (MS) methods, the sum over all restarts. As in the STAT5 example, the improved robustness of multiple-shooting gives rise to increased computational cost compared to single shooting. The benefit of the hybrid becomes evident from the fact that its computational cost is about 8 times lower than that of DE for Box 5, 60 times lower for Box 10 and around 40 times lower for Box 100.

Conclusion

In this study we present a new hybrid strategy as a reliable method for solving challenging parameter estimation problems encountered in systems biology. The proposed method has two advantages over previous hybrid methods: First, it is equipped with a switching strategy which allows the systematic determination of the transition from the global to the local search. This avoids computationally expensive tests in advance and constitutes a major benefit of the proposed method. Second, using multiple-shooting as the local search procedure reduces the multi-modality of the non-linear optimization problem. Because multiple-shooting avoids possible spurious solutions in the vicinity of the global optimum, it outperforms the initial value approach (single shooting), yielding an enhanced robustness of the hybrid.

We analyzed the performance of this new approach using two examples: the dynamical model of the STAT5 signaling pathway and Goodwin's oscillatory feedback control system.

Authors' contributions

C.F. initiated the work, M.P. implemented the multiple-shooting algorithm. M.P. and E.B.C. implemented the hybrid algorithm. E.B.C. performed the simulations. E.B.C., C.F., and M.P. drafted the manuscript. J.T. and J.B. proposed the main idea, gave valuable advice and helped to draft the manuscript. All authors read and approved the final manuscript.

Comparison of the multistart of the generalized-quasi-Newton within single and multiple-shooting for the JAK/STAT5 pathway

Comparison of the multistart of the generalized-quasi-Newton within single and multiple-shooting for the JAK/STAT5 pathway. Shown is the percentage of convergence to the global minimum, local minima or failure of the optimisation method using 100 restarts. The initial guess of each restart is randomly chosen from interval [0, 5] (Box 5), [0, 10] (Box 10), and [0, 100] (Box 100) using a uniform distribution. **a**) Noise-to-signal ratio is zero. As anticipated, multiple-shooting (right panel) performs better than single shooting (left panel). **b**) Same as in a), but using a noise-to-signal ratio of 10%.

Comparison of the multistart of the generalized-quasi-Newton within single and multiple-shooting for the Goodwin model

Comparison of the multistart of the generalized-quasi-Newton within single and multiple-shooting for the Goodwin model. Shown is the percentage of convergence to the global minimum, local minima or failure of the optimisation method using 100 restarts. The initial guess of each restart is randomly chosen from interval [0, 100] (Box 100), and [0, 1000] (Box 1000) using a uniform distribution. **a**) Noise-to-signal ratio is zero. **b**) Same as in a), but using a noise-to-signal ratio of 10%.

Convergence curves for the DE and the hybrid method to the global optimum of the Goodwin problem given by Eq. (9)

Convergence curves for the DE and the hybrid method to the global optimum of the Goodwin problem given by Eq. (9). **a**) 0% noise-to-signal ratio. The left figure shows the value of the objective function as a function of CPU time for different box sizes. CPU time is normalized using the Linpack benchmark table. The right figure displays the convergence of the hybrid strategy. **b**) Same as in a) but with a 10% noise-to-signal ratio. The difference between the final values of the cost function in **a**) and **b**) is due to the added noise.

Acknowledgements

This work was supported by the European Community as part of the FP6 COSBICS Project (STREP FP6-512060), the German Federal Ministry of Education and Research, BMBF-project FRISYS (grant 0313921) and Xunta de Galicia (PGIDIT05PXIC40201PM).