Computer Systems Department, Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia

Faculty of Administration, University of Ljubljana, Gosarjeva ulica 5, SI-1000 Ljubljana, Slovenia

Department of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia

Abstract

Background

We address the task of parameter estimation in models of the dynamics of biological systems based on ordinary differential equations (ODEs) from measured data, where the models are typically non-linear and have many parameters, the measurements are imperfect due to noise, and the studied system can often be only partially observed. A representative task is to estimate the parameters in a model of the dynamics of endocytosis, i.e., endosome maturation, reflected in a cut-out switch transition between the Rab5 and Rab7 domain protein concentrations, from experimental measurements of these concentrations. The general parameter estimation task and the specific instance considered here are challenging optimization problems, calling for the use of advanced meta-heuristic optimization methods, such as evolutionary or swarm-based methods.

Results

We apply three global-search meta-heuristic algorithms for numerical optimization, i.e., the differential ant-stigmergy algorithm (DASA), particle-swarm optimization (PSO), and differential evolution (DE), as well as a local-search derivative-based method, Algorithm 717 (A717), to the task of estimating parameters in ODEs. We evaluate their performance on the considered representative task along a number of metrics, including the quality of reconstructing the system output and the complete dynamics, as well as the speed of convergence, both on real-experimental data and on artificial pseudo-experimental data with varying amounts of noise. We compare the four optimization methods under a range of observation scenarios, where data of different completeness and accuracy of interpretation are given as input.

Conclusions

Overall, the global meta-heuristic methods (DASA, PSO, and DE) clearly and significantly outperform the local derivative-based method (A717). Among the three meta-heuristics, differential evolution (DE) performs best in terms of the objective function, i.e., reconstructing the output, and in terms of convergence. These results hold for both real and artificial data, for all observability scenarios considered, and for all amounts of noise added to the artificial data. In sum, the meta-heuristic methods considered are suitable for estimating the parameters in the ODE model of the dynamics of endocytosis under a range of conditions. Since the model and conditions are representative of parameter estimation tasks in ODE models of biochemical systems, our results clearly highlight the promise of bio-inspired meta-heuristic methods for parameter estimation in dynamic system models within systems biology.

Background

Reconstructing the structure and behavior of biological systems is of fundamental importance to the field of systems biology. In general, biological systems exhibit complex nonlinear dynamic behavior, which is often modeled using ordinary differential equations (ODEs). A common approach to constructing an ODE model of an observed biological system is to decompose the modeling process into two tasks: identifying the structure of the model equations and estimating the values of their constant parameters.

Due to the highly nonlinear dynamics and the limited measurability of biological systems, the parameter estimation task is challenging and computationally expensive. Most parameter estimation tasks in systems biology are multi-modal, i.e., have many local optima, which prohibits the use of local search methods. Furthermore, the models are often high-dimensional, making the parameter estimation task computationally complex. Finally, the measurability of systems in cell and molecular biology is highly limited. Many system variables are not directly observable. For the few that can be measured, the measured data are noisy and taken at a coarse time resolution. All these constraints, combined with the complex dynamics of the considered models, can lead to identifiability problems, i.e., the impossibility of uniquely estimating the unknown model parameters, making parameter estimation an even harder optimization task

There are two broad classes of approaches to the parameter estimation task.

Representative methods from both classes of approaches are commonly used for parameter estimation in the field of systems biology

Related work using least-squares methods for parameter estimation in systems biology

We address the task of estimating the parameters of a nonlinear ODE model of endocytosis, more specifically of the maturation of endosomes, which are membrane-bound intracellular compartments used to transport and disintegrate external cargo. The model focuses on a key endocytotic regulatory system that switches from cargo transport in early endosomes to cargo disintegration in mature endosomes

In this paper, we study the effect of this kind of limited observability of the system dynamics on the complexity of the parameter estimation task, as well as the applicability and performance of four different optimization methods in this context. In order to do so, we define four different observation scenarios and generate artificial (pseudo-experimental) data for each of them. The scenarios cover a wide range of situations, from the simplest one of complete observability, where the concentrations of all protein states are assumed to be directly measurable, to the most complex (and realistic) scenario, where the observations are limited to the total concentrations of proteins in all their different states. We test the performance of the selected optimization methods in the different observation scenarios and compare the ability of different methods to cope with them. A final set of experiments, based on real-experimental data, is performed in order to check the validity of the results obtained on artificial data. More specifically, we test the methods' performance on measured data obtained through real-world biological experiments that correspond to the most complex observation scenario described above.

Our study includes four optimization methods: the differential ant-stigmergy algorithm (DASA), our own, recently developed meta-heuristic method for global optimization

Parameter estimation in ODE models

The task of parameter estimation in ODE models can be formalized as follows. Given a model structure with unknown constant parameters p = (p_{1}, ..., p_{D}) and measurements of the system behavior, find the values p^{opt} of the parameters that minimize the discrepancy between the measured behavior of the system and the behavior simulated with the model.

Nonlinear least-squares estimation

Among the different objective functions suggested for measuring the goodness of fit, the maximum-likelihood estimator is chosen: the optimal parameter values p^{opt} are those that maximize the likelihood of the measured data. The likelihood function depends on the probability of the measurements given the model predictions. Under the common assumption of independent, normally distributed measurement errors, maximizing the likelihood is equivalent to minimizing the sum of squared errors

SSE = Σ_{i} Σ_{j} (y_{ij} - ŷ_{ij})^{2},     (1)

where y_{ij} denotes the i^{th} measured output at the j^{th} time point, and ŷ_{ij} the i^{th} output at the j^{th} time point, as predicted by the model.
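Concretely, under the Gaussian-error assumption the objective reduces to the sum of squared errors of Eq. (1); a minimal sketch (function and variable names are illustrative, not taken from the paper's code):

```python
import numpy as np

def sse(y_measured, y_simulated):
    """Sum of squared errors between measured and simulated outputs,
    both given as arrays of shape (n_outputs, n_timepoints)."""
    residuals = np.asarray(y_measured) - np.asarray(y_simulated)
    return float(np.sum(residuals ** 2))

# Toy check: two outputs observed at three time points.
y_obs = np.array([[1.0, 2.0, 3.0], [0.5, 0.4, 0.3]])
y_sim = np.array([[1.1, 1.9, 3.0], [0.5, 0.5, 0.3]])
print(sse(y_obs, y_sim))  # ≈ 0.03
```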

ODE models and observability

A model based on ODEs defines the temporal changes of a set of system variables x as dx/dt = f(x, u, p), where u denotes the exogenous (input) variables and p the constant parameters. Given the initial values x(t_{0}) = x_{0} and the values of the exogenous variables u at the time points [t_{0}, ..., t_{N-1}], the model can be simulated to obtain the values of the system variables x at the same time points [t_{0}, ..., t_{N-1}].

An analytical solution for complex nonlinear ODE integration problems does not exist in general: one has to apply numerical approximation methods for ODE integration. To this end, we use the CVODE package, a general-purpose ODE solver that uses the adaptive-step Adams-Moulton and backward differentiation formula methods for integration
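The paper relies on CVODE; purely as an illustration of the same kind of adaptive-step implicit integration, the sketch below uses SciPy's BDF solver on a made-up stiff system (the system and tolerances are ours, not the paper's):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy stiff system (not the endocytosis model): x0 relaxes very quickly
# onto cos(t), so an implicit BDF method is appropriate, as in CVODE.
def rhs(t, x):
    return [-1000.0 * (x[0] - np.cos(t)), x[0] - x[1]]

sol = solve_ivp(rhs, (0.0, 5.0), [0.0, 0.0], method="BDF",
                rtol=1e-8, atol=1e-10)
print(sol.success)  # True
```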

Note that the ODE model captures the behavior of the system variables

In the simplest observation scenario, the values of all the system variables are directly observed (measured).

Methods

Optimization methods

This section describes the optimization approaches used to solve the nonlinear parameter estimation task for the Rab5-to-Rab7 conversion model. We address the task using the recently developed swarm-based meta-heuristic differential ant-stigmergy algorithm (DASA), motivated by the fact that DASA has shown promising results in solving large-scale continuous global optimization problems, but has not yet been applied to the challenging task of parameter estimation in nonlinear ODE models. In addition, we use two well-established meta-heuristics for global optimization, i.e., particle swarm optimization (PSO) and differential evolution (DE), as well as the derivative-based Algorithm 717 (A717), essentially designed for nonlinear least-squares estimation. Below we provide a description of each of the four methods. We also specify the parameter settings of each method as used in our experimental evaluation. The used parameter settings were selected by Sobol'-sampling-based parameter tuning

The differential ant-stigmergy algorithm

The differential ant-stigmergy algorithm (DASA) was initially proposed in 2006 by Korošec

First, the DASA approach transforms the continuous-domain optimization problem into a graph-search problem: the possible offsets (differences) of each parameter value are discretized, with a chosen discretization base and maximum parameter precision, into a search graph whose paths correspond to candidate moves in the parameter space.

Second, DASA performs a pheromone-based search that involves best-solution-dependent pheromone distribution. The amount of pheromone is distributed over the graph vertices according to the Cauchy probability density function (PDF), whose shape is controlled by three real-valued parameters: the global scale increase factor s_{+}, the global scale decrease factor s_{-}, and the pheromone evaporation factor ρ.

The main loop of the DASA method, visualized in Figure , is repeated until a termination criterion is met. If all the ants only find paths composed of zero-valued offsets, the search process is restarted by randomly selecting a new temporary-best solution and reinitializing the pheromone distributions. Related to this, DASA keeps information about a globally best solution, called the global-best solution. This solution is the best over all restarted searches, while the temporary-best solution is the best solution found within one search (restart).

The differential ant-stigmergy algorithm (DASA)

**The differential ant-stigmergy algorithm (DASA)**. High-level block-diagram representation of the DASA method.

Particle swarm optimization

Particle swarm optimization (PSO) is a stochastic population-based optimization technique developed by Eberhart and Kennedy in 1995

The basic PSO method initializes the swarm with S particles placed at random positions in the search space. Each particle keeps track of the best position it has found so far (its personal-best position x_{p}) and of the best position found so far in its neighborhood (x_{n}). At each iteration, the velocity v and position x of each particle are updated as

v ← w·v + c_{1}·r_{1}·(x_{p} - x) + c_{2}·r_{2}·(x_{n} - x),  x ← x + v,

where w is the inertia weight, c_{1} and c_{2}, called acceleration coefficients, are positive real values that balance the influence of the cognitive and the social component, while r_{1} and r_{2} are random factors uniformly sampled from the unit interval that introduce a stochastic component in the search.

The particular version of PSO used in our experimental evaluation is a standard variation of the basic PSO (the implementation is available online).
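The update rule above can be sketched as a minimal global-best PSO on a toy function; this simplified topology is not the variable random topology actually used, and all settings here are illustrative (the w and acceleration values merely echo the tuned PSO settings reported in this paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_minimize(f, bounds, n_particles=30, w=0.762, c1=1.037, c2=1.037,
                 n_iter=200):
    # Initialize particle positions uniformly inside the bounds, velocities at zero.
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    x = rng.uniform(lo, hi, size=(n_particles, lo.size))
    v = np.zeros_like(x)
    pbest = x.copy()                      # personal-best positions
    pbest_f = np.array([f(p) for p in x])
    g = pbest[np.argmin(pbest_f)].copy()  # global-best position
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # Inertia term + cognitive component + social component.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[np.argmin(pbest_f)].copy()
    return g, float(pbest_f.min())

best_x, best_f = pso_minimize(lambda p: float(np.sum(p ** 2)),
                              ([-5.0, -5.0], [5.0, 5.0]))
print(best_f)  # a value very close to 0
```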

Differential evolution

Differential evolution (DE) is a simple and efficient population-based heuristic for optimizing real-valued multi-modal functions, introduced by Storn and Price in the 1990s

The main difference between a traditional EA and DE is in the reproduction step, where for every candidate individual x_{c} in the population a mutant vector is created from three individuals (x_{1}, x_{2}, x_{3}), selected at random or by quality, based on one difference vector, i.e., v = x_{1} + F·(x_{2} - x_{3}). The rate at which the population evolves can be controlled by the scale (mutation) factor F. The mutant vector v is then recombined with the candidate x_{c} through crossover, where the crossover rate CR controls the fraction of vector components taken from the mutant. In strategies of the rand-to-best type, the base vector is additionally shifted towards the current best individual x_{best}. Finally, the resulting trial vector replaces x_{c} in the population if it yields a better value of the objective function.

The implementation used in our experimental evaluation is based on the implementation of the DE algorithm described in the technical report by Storn and Price
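A minimal sketch of the "DE/rand-to-best/1/bin" strategy named below, in one common formulation (the population size, generation count, and toy objective are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def de_rand_to_best_1_bin(f, bounds, pop_size=40, F=0.915, CR=0.942,
                          n_gen=200):
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    d = lo.size
    pop = rng.uniform(lo, hi, size=(pop_size, d))
    fit = np.array([f(ind) for ind in pop])
    for _ in range(n_gen):
        best = pop[np.argmin(fit)]
        for c in range(pop_size):
            r1, r2, r3 = rng.choice(pop_size, size=3, replace=False)
            # rand-to-best mutation: move a random base vector towards the
            # current best, plus one scaled difference vector ("/1").
            mutant = pop[r1] + F * (best - pop[r1]) + F * (pop[r2] - pop[r3])
            # Binomial crossover ("bin") with at least one mutant component.
            mask = rng.random(d) < CR
            mask[rng.integers(d)] = True
            trial = np.clip(np.where(mask, mutant, pop[c]), lo, hi)
            f_trial = f(trial)
            if f_trial <= fit[c]:  # greedy one-to-one selection
                pop[c], fit[c] = trial, f_trial
    return pop[np.argmin(fit)], float(fit.min())

best_x, best_f = de_rand_to_best_1_bin(lambda p: float(np.sum(p ** 2)),
                                       ([-5.0] * 3, [5.0] * 3))
print(best_f)  # a value very close to 0
```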

Algorithm 717

Algorithm 717 (A717) is a set of modules for solving parameter estimation problems in nonlinear regression models, such as nonlinear least-squares, maximum-likelihood, and some robust fitting problems

In order to promote convergence from poor starting guesses, the algorithm implements the idea of a local quadratic model m_{i} of the objective function, maintained at iteration i and trusted only within a trust region around the current iterate x_{i}. The next trial step s_{i+1} is chosen to approximately minimize m_{i} within the trust region. The reduction of the objective function actually achieved at x_{i+1} is used for model updating and also to resize and reshape the trust region.

Among the modules, we can choose the ones for unconstrained optimization, or the ones that use simple bound constraints on the parameters. Furthermore, we can choose between modules that involve approximate computation of the needed derivatives by finite differences, and modules that expect the derivatives of the objective function to be provided by the routine that calls them.
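A717 itself is distributed as Fortran modules; as a rough modern analogue of bounded trust-region nonlinear least squares, the sketch below uses SciPy's `least_squares` (the model, data, and bounds are made up):

```python
import numpy as np
from scipy.optimize import least_squares

# Fit a two-parameter exponential decay to noise-free synthetic data with
# a bounded trust-region solver ('trf'), analogous in spirit to A717's
# bound-constrained modules.
t = np.linspace(0.0, 4.0, 50)
y = 2.0 * np.exp(-1.3 * t)

def residuals(p):
    return p[0] * np.exp(-p[1] * t) - y

fit = least_squares(residuals, x0=[1.0, 1.0], bounds=([0.0, 0.0], [10.0, 10.0]))
print(np.round(fit.x, 3))  # ≈ [2.0, 1.3]
```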

In this work, we used the original implementation of A717 as available online

Parameter settings

In the text above, we described the optimization methods used for parameter estimation in the endocytosis model. Among these, the meta-heuristic approaches have many parameters that guide the search and consequently influence the methods' performance. To obtain the best possible performance on a given problem, one should consider task-specific tuning of the parameter settings of the optimization method used (see, e.g., the study by Daeger

There are two common approaches to choosing parameter values

A detailed discussion and survey of parameter tuning methods is given by Eiben and Smit

In this paper, parameter tuning for the meta-heuristic optimization methods was performed with a sampling method based on Sobol' sequences, introduced by Sobol' in 1967

The DE method has only four parameters, while DASA and PSO have more; consequently, we chose four parameters per method to be tuned. For DASA, we chose the three real-valued parameters that directly influence the search heuristic (s_{+}, s_{-}, and ρ), as well as the number of ants m.

The parameters of the three meta-heuristic methods chosen for Sobol'-sampling-based parameter tuning and their ranges are summarized in Table

Setup and results of Sobol'-sampling-based parameter tuning of optimization methods.

| | **DASA** | | | | **PSO** | | | | **DE** | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **Parameter** | **m** | **ρ** | **s_{+}** | **s_{-}** | **S** | **K** | **w** | **c** | **P** | **STR** | **F** | **CR** |
| Lower | 4 | 0 | 0 | 0 | 4 | 1 | 0 | 1 | 6 | 1 | 0 | 0 |
| Upper | 200 | 1 | 1 | – | 200 | – | 1 | 4 | 200 | 10 | 2 | 1 |
| Tuned | 144 | 0.036 | 0.573 | 0.01 | 155 | 89 | 0.762 | 1.037 | 81 | 8 | 0.942 | 0.915 |

The table includes the search ranges (lower and upper bounds) for the four tuned parameters of each of the three meta-heuristic optimization methods. We used the Sobol' sampling procedure with the number of sampling points set to 2000. The resulting vector of method parameters was chosen as the one that showed the best median performance (according to the SSE metric) in the multiple-run experiments among the 2000 sampled parameter settings. A single experiment included eight runs, each performed with half a million objective function evaluations. The parameter tuning was performed on the complete observation scenario using noise-free artificial data.
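The tuning procedure described in the note can be sketched as follows, using SciPy's quasi-Monte Carlo module; `run_optimizer` is a hypothetical stand-in for launching an optimizer with a candidate setting, and the ranges follow the DE columns of the table (the sample size here is reduced for illustration):

```python
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(2)

def run_optimizer(setting):
    """Hypothetical placeholder: would run one optimization with the given
    (P, STR, F, CR) setting and return the achieved SSE."""
    pop_size, strategy, F, CR = setting  # pop_size/strategy unused in this toy
    return (F - 0.9) ** 2 + (CR - 0.9) ** 2 + rng.normal(0.0, 1e-3)

# Draw 64 scrambled Sobol' points in the unit hypercube, then scale them
# to the tuning ranges (DE row of the table: P, STR, F, CR).
sampler = qmc.Sobol(d=4, scramble=True, seed=2)
unit = sampler.random(64)
lower = np.array([6.0, 1.0, 0.0, 0.0])
upper = np.array([200.0, 10.0, 2.0, 1.0])
settings = qmc.scale(unit, lower, upper)

# Keep the setting with the best median objective value over repeated runs.
medians = [np.median([run_optimizer(s) for _ in range(8)]) for s in settings]
best = settings[int(np.argmin(medians))]
print(best.shape)  # (4,)
```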

DASA setup

The discretization base is set to 10, the maximum parameter precision is set to 10^{-15}, the number of ants is set to 144, the global scale increase factor to 0.575, the global scale decrease factor to 0.01, and the pheromone evaporation factor to 0.036.

PSO setup

A variable random topology was chosen, the particle swarm size was set to 155, the neighborhood size to 89, the inertia weight to 0.762, and the acceleration coefficient to 1.037. In addition, default settings were used for the remaining parameters related to advanced options not included in the standard PSO method.

DE setup

The chosen strategy was "DE/rand-to-best/1/bin", the population size was set to 81, the weight factor to 0.915, and the crossover factor to 0.942.

Comparison methodology

To guarantee a fair comparison, we ran each of the four optimization methods 25 times, allowing half a million evaluations of the objective function per single run. We used a number of performance evaluation metrics to compare the utility of the optimization methods for parameter estimation; the reported method performance is the average/median performance over all 25 runs. While the first quality measure concerns the convergence rate of the optimization

The division by the number of data points and the square root in RMSE make its measurement units and scale comparable to those of the observed output variables. This is in contrast with the SSE measure defined in Eq. (1). Finally, note that better models have smaller values of RMSE.
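The relation between the two measures can be sketched in a couple of lines (names illustrative):

```python
import numpy as np

def rmse(y_measured, y_simulated):
    """Root mean squared error: the SSE divided by the number of data
    points, under a square root, so units match the observed outputs."""
    res = np.asarray(y_measured) - np.asarray(y_simulated)
    return float(np.sqrt(np.mean(res ** 2)))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # ≈ 0.577
```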

As defined above, the RMSE quality metric measures the degree-of-fit between the simulated model output and the observed system output. However, reconstruction of the system dynamics goes beyond reconstructing the output; ultimately, modeling is about capturing the complete (also unobserved) system dynamics. To measure this aspect of reconstruction quality, we have to measure the degree-of-fit between the simulated and the observed values of the system variables. Although this is impossible in real cases, where the system variables cannot be directly observed, experiments with artificial data allow us to measure this aspect of model quality. In this context, we use an additional model quality metric when comparing the methods in the case of artificial data.

The distributions of the metric values over multiple runs are presented with boxplots, where the bottom and top of each box correspond to the 25^{th} and 75^{th} percentiles of the sample, respectively; in consequence, the box height corresponds to the interquartile range (IQR) of the sample.

The RMSEm metric is defined analogously to RMSE, but over all the system variables instead of only the observed outputs. To compute it for a model m with estimated parameter values, we simulate m from the initial values x_{0} and calculate

RMSEm = sqrt((1 / (V · N_{tp})) Σ_{i} Σ_{j} (x_{ij} - x̂_{ij})^{2}),

where N_{tp} denotes the number of time points, V the number of system variables, x_{ij} the reference (true) value of the i^{th} system variable at the j^{th} time point, and x̂_{ij} the corresponding value obtained by simulating the model with the estimated parameters.

Practical parameter identifiability

The problem of the uniqueness of the estimated parameters in a given model is related to the issue of parameter identifiability. We can distinguish between structural and practical identifiability. Structural identifiability is a theoretical property of the model structure, depending only on the model input (stimulation function) and output (observation function): it is not related to the specific values of the model parameters. The parameters of a given model are structurally globally identifiable if they can be uniquely estimated from the designed experiment under the ideal conditions of noise-free observations and an error-free model structure

Even when we deal with a structurally identifiable model, it can still happen that the parameters cannot be uniquely identified from the available experimental data. In this case, we experience a practical identifiability problem, related to the amount and quality of the available experimental data. Practical identifiability analysis can also help us to assess the uncertainty of the parameter estimates and to compare possible experimental designs without performing the experiments. Parameter uncertainties (confidence intervals) may be computed by using the Fisher information matrix (FIM) or a Monte Carlo-based approach. Details and further references on this topic are given by Balsa-Canto and Banga

We are going to assess the practical identifiability of the parameters in the endocytosis model using the Monte Carlo-based sampling approach: we generate a number of noisy datasets x_{noisy} = x · (1 + δ·ε), with relative noise level δ and standard normal ε, and re-estimate the parameters from each of them. Within the resulting sample of estimates of each parameter, an estimate is declared an outlier if it falls more than 1.5·IQR below or above the 25^{th} and 75^{th} percentiles of the sample, respectively. The detected outliers are removed: more precisely, the new (reduced) sample includes only the estimates obtained from those datasets that did not produce any outlier over all parameters. The distributions of the parameters are presented with histograms, including the corresponding 95% confidence intervals bounded by the 2.5^{th} and 97.5^{th} percentiles of the sample. The width of the histogram bins is set to

w = 2 · IQR · n_{s}^{-1/3},

where IQR is the interquartile range of the sample and n_{s} the sample size.
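These sample statistics can be sketched as follows; the 1.5·IQR outlier fences follow our reading of the text, so treat them as an assumption rather than the paper's exact rule:

```python
import numpy as np

def fd_bins_and_outliers(sample):
    """IQR-based histogram bin width, outlier fences (assumed 1.5*IQR),
    and percentile-based 95% confidence interval for a 1-D sample."""
    q1, q3 = np.percentile(sample, [25, 75])
    iqr = q3 - q1
    width = 2.0 * iqr * len(sample) ** (-1.0 / 3.0)  # bin width rule
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr          # outlier fences
    ci95 = np.percentile(sample, [2.5, 97.5])        # 95% CI bounds
    return width, (lo, hi), ci95

rng = np.random.default_rng(3)
sample = rng.normal(1.0, 0.1, size=1000)  # toy sample of parameter estimates
width, (lo, hi), ci = fd_bins_and_outliers(sample)
print(width > 0, lo < 1.0 < hi)  # True True
```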

Based on the outlier-free samples of parameter estimates, the correlation of two model parameters p_{i} and p_{j} is estimated with the Pearson correlation coefficient

corr(p_{i}, p_{j}) = cov(p_{i}, p_{j}) / (σ_{i} · σ_{j}),

where cov(p_{i}, p_{j}) denotes the covariance of the samples of estimates of p_{i} and p_{j}, while σ_{i} and σ_{j} denote their standard deviations.
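A sketch of this estimate on a toy sample of strongly coupled parameter estimates (the data and names are made up):

```python
import numpy as np

rng = np.random.default_rng(4)
k_i = rng.normal(0.5, 0.05, size=200)                   # estimates of one parameter
k_j = 2.0 * k_i + rng.normal(0.0, 0.01, size=200)       # strongly coupled second parameter

# Pearson correlation: covariance divided by the product of standard
# deviations (matching ddof so the ratio is the exact sample correlation).
corr = np.cov(k_i, k_j)[0, 1] / (np.std(k_i, ddof=1) * np.std(k_j, ddof=1))
print(round(corr, 2))  # close to 1.0 for this coupled pair
```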

Results and Discussion

Endocytosis model

This work addresses the task of parameter estimation in a practically relevant model of endocytosis, i.e., the life-cycle of endosomes. Endosomes are membrane-bound intracellular components that typically encapsulate, transport, and disintegrate external cargo within cells. The model at hand focuses on the process of endosome maturation, representing it by a cut-out switch between the concentrations of Rab5 and Rab7 domain proteins

To model the Rab5-to-Rab7 conversion, we distinguish between active and inactive (passive) states of the Rab5 and Rab7 domain proteins. Thus, the ODE model involves four system (endogenous) variables corresponding to the concentrations of Rab5 domain proteins in the inactive (r_{5}) and active state (R_{5}) and of Rab7 domain proteins in the inactive (r_{7}) and active state (R_{7}), measured in mol/l. These four species (chemical compounds) are involved in ten different biochemical reactions v_{1}, ..., v_{10}, parameterized with eighteen constant parameters k_{1}, ..., k_{18} corresponding to the kinetic rates of the reactions, leading to the following structure of the model ODEs:

Here, v_{1}, ..., v_{10} denote the kinetic models of the corresponding biochemical reactions, given below:

Note that all variables in the model are system variables, i.e., x = {r_{5}, R_{5}, r_{7}, R_{7}}, and there are no exogenous variables, i.e., u = ∅.

Figure

Simulated behavior of the Rab5-to-Rab7 conversion model

**Simulated behavior of the Rab5-to-Rab7 conversion model**. Simulation of the cut-out switch model of the conversion of Rab5 domain proteins to the Rab7 domain proteins in the regulatory system of endocytosis as proposed by Del Conte-Zerial

and initial values of the state variables set to

as proposed by Del Conte-Zerial. The concentrations of the active-state proteins R_{5} and R_{7} follow the expected (rapid) cut-out switch from high Rab5 and low Rab7 to low Rab5 and high Rab7 concentrations, while the concentrations of the passive-state proteins r_{5} and r_{7} remain almost constant throughout the whole process, with a small but notable change at the transition point.

In sum, the task of parameter estimation in the Rab5-to-Rab7 cut-out switch model leads to a 22-dimensional continuous minimization problem, with 18 dimensions corresponding to the model parameters and four dimensions corresponding to the initial values of the four system variables. The objective function is the discrepancy (SSE) between the measured outputs y_{i} and the outputs simulated from the model with the candidate values of the parameters and of the initial values r_{5}(0), R_{5}(0), r_{7}(0), and R_{7}(0).

In order to evaluate the performance of different parameter optimization methods on this task, we conducted experiments with artificial data, obtained by simulating the Rab5-to-Rab7 conversion model, and with real data from experimental measurements.

Data

Artificial (pseudo-experimental) data

We generated the artificial data by simulating the ODE model from Eqs. (9)-(12) at 2781 equally spaced time points inside the interval [0, 1551] seconds. To obtain more realistic artificial data, we added normally distributed (Gaussian) noise proportional to the signal, i.e., x_{noisy} = x · (1 + δ·ε), with relative noise level δ and standard normal ε. In addition to the noise-free data, we consider noisy datasets with 5% and 20% relative noise.
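A sketch of generating such noisy data, assuming the multiplicative (relative) Gaussian noise model described above; the switch-like signal below is a toy stand-in for an actual model simulation:

```python
import numpy as np

rng = np.random.default_rng(5)

def add_relative_noise(x, level):
    """Add zero-mean Gaussian noise proportional to the signal;
    `level` is the relative noise level (e.g. 0.05 for 5%)."""
    x = np.asarray(x, float)
    return x + level * x * rng.standard_normal(x.shape)

t = np.linspace(0.0, 1551.0, 2781)                    # time grid from the paper
clean = 1.0 / (1.0 + np.exp(-(t - 700.0) / 50.0))     # toy switch-like trajectory
noisy = add_relative_noise(clean, 0.05)
print(noisy.shape)  # (2781,)
```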

Measured (real-experimental) data

In the second set of experiments, we used the real time-course measurements from Del Conte-Zerial, where the measured outputs correspond to the total concentrations of the Rab5 and Rab7 domain proteins, i.e., y_{1}(t) = r_{5}(t) + R_{5}(t) and y_{2}(t) = r_{7}(t) + R_{7}(t).

Observation scenarios

The limited measurability of the system variables in the real-world measurement scenario, described above, represents one of the most challenging properties of the parameter estimation task addressed in this paper. To evaluate the impact that the limited observability has on the difficulty of the optimization task (and consequently on the performance of different optimization methods), we define here four observation scenarios, ranging from the simplest one that assumes that all the system variables can be directly measured to the most complex one that corresponds to the limitations of the real measurement process described in the previous paragraph.

Complete observation (CO)

In this scenario, we assume that all the system variables are directly observed, meaning that the measurement process can identify the four concentrations of the active and inactive states of the Rab5 and Rab7 proteins at each time point, i.e., y_{1}(t) = r_{5}(t), y_{2}(t) = R_{5}(t), y_{3}(t) = r_{7}(t), and y_{4}(t) = R_{7}(t).

Active-state protein concentration observation (AO)

Here, we assume that only the concentrations of the active-state proteins can be observed, i.e., y_{1}(t) = R_{5}(t) and y_{2}(t) = R_{7}(t).

Total protein concentration observation (TO)

This scenario represents the real measurement process outlined above, where only the total protein concentrations are observed, i.e., y_{1}(t) = r_{5}(t) + R_{5}(t) and y_{2}(t) = r_{7}(t) + R_{7}(t).

Neglecting passive-state protein concentration (NPO)

This is the scenario based on how the measurements are (visually) matched against model simulations by Del Conte-Zerial: the simulated concentrations of the active-state proteins are directly compared to the measured total concentrations, i.e., the model outputs y_{1}(t) = R_{5}(t) and y_{2}(t) = R_{7}(t) are matched against the measured values of r_{5}(t) + R_{5}(t) and r_{7}(t) + R_{7}(t), neglecting the contribution of the passive-state proteins.
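The four scenarios' observation functions can be sketched compactly; the variable shorthand (r5, R5, r7, R7) and the NPO reading (passive states neglected on the simulation side) are our assumptions:

```python
import numpy as np

def observe(scenario, r5, R5, r7, R7):
    """Map simulated trajectories of the four states to observed outputs."""
    if scenario == "CO":   # all four concentrations observed directly
        return np.stack([r5, R5, r7, R7])
    if scenario == "AO":   # only the active-state concentrations
        return np.stack([R5, R7])
    if scenario == "TO":   # totals per protein family
        return np.stack([r5 + R5, r7 + R7])
    if scenario == "NPO":  # passive states neglected on the simulation side
        return np.stack([R5, R7])
    raise ValueError(scenario)

state = [np.full(10, 0.2), np.full(10, 0.8),
         np.full(10, 0.1), np.full(10, 0.3)]
print(observe("CO", *state).shape, observe("TO", *state).shape)  # (4, 10) (2, 10)
```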

Parameter estimation with artificial data

Given the artificial data described above (obtained using the reference values of the constant parameters from Eq. (11)), we can calculate the value of the objective function at the reference point for each noise level and observation scenario: these are reported in Table

Values of the quality metrics for the reference model.

| **Noise** | **Scenario** | **SSE** | **RMSE** |
|---|---|---|---|
| 0% | CO | 0 | 0 |
| | AO | 0 | 0 |
| | TO | 0 | 0 |
| | NPO | 5549.839 | 1.413 |
| 5% | CO | 2.653 | 0.031 |
| | AO | 1.289 | 0.022 |
| | TO | 2.591 | 0.031 |
| | NPO | 5556.486 | 1.414 |
| 20% | CO | 42.447 | 0.124 |
| | AO | 20.627 | 0.086 |
| | TO | 41.452 | 0.122 |
| | NPO | 5607.516 | 1.420 |

The reference model was used for generation of the artificial data.

Let us now consider the RMSE performance of the four parameter estimation methods (DASA, PSO, DE, and A717) on the artificial datasets with three levels of noise (0%, 5%, and 20%) under the four observation scenarios (CO, AO, TO, and NPO). Figure

RMSE performance of the models obtained by parameter estimation from artificial data

**RMSE performance of the models obtained by parameter estimation from artificial data**. Boxplots of the performance distributions of the four optimization methods (DASA, PSO, DE, and A717) in terms of the quality of the reconstructed output (RMSE), when considering four different observation scenarios (columns CO, AO, TO, and NPO) and three artificial datasets (rows): a) noise-free,

The comparison among observation scenarios shows that the CO and AO scenarios are very similar: they induce an identical ranking of the optimization methods in terms of performance at all noise levels. The rankings are slightly different (but still very similar) in the case of TO, and quite different in the implausible scenario (see the discussion above) of NPO. As the noise level increases, the AO scenario seems to become an easier task than the CO scenario, leading to much better optimal values of RMSE, while the CO scenario becomes very similar to the TO scenario. In the NPO case, all four optimization methods overfit the observed output, leading to values of the objective function that are smaller than the value at the reference point from Table

However, comparing the RMSEm performance, i.e., the quality of the complete model reconstruction, leads to much clearer conclusions about the relative difficulty of the four observation scenarios. Figure

RMSEm performance of the models obtained by parameter estimation from artificial data

**RMSEm performance of the models obtained by parameter estimation from artificial data**. Boxplots of the performance distributions of the four optimization methods (DASA, PSO, DE, and A717) in terms of the quality of the complete model reconstruction (RMSEm), when considering four different observation scenarios (columns CO, AO, TO, and NPO) and three artificial datasets (rows): a) noise-free,

The convergence curves in Figure

Convergence performance of the optimization methods on the task of parameter estimation from artificial data

**Convergence performance of the optimization methods on the task of parameter estimation from artificial data**. Convergence curves of the four parameter estimation methods (DASA, PSO, DE, and A717) applied to three artificial datasets (columns) and four observation scenarios (rows): a) CO; b) AO; c) TO; and d) NPO. Graphs in the left column correspond to the noise-free data set, while the graphs in the middle and right column correspond to the noisy datasets with 5% and 20% relative noise, respectively. In order to capture the convergence trend over a wide range of values, the convergence curves are plotted using logarithmic scales for both axes.

In order to assess the statistical significance of the differences in performance across all scenarios, two Holm tests were conducted using first the median values of RMSE and then the median values of RMSEm. The corresponding median values are given in Table

Results on RMSE and RMSEm of the models estimated from artificial data.

| **Noise** | **Scenario** | **RMSE** | | | | **RMSEm** | | | |
|---|---|---|---|---|---|---|---|---|---|
| | | **DASA** | **PSO** | **DE** | **A717** | **DASA** | **PSO** | **DE** | **A717** |
| 0% | CO | 0.0651 | 0.0527 | **0.0189** | 0.7005 | 0.0651 | 0.0527 | **0.0189** | 0.7005 |
| | AO | 0.0625 | 0.0539 | **0.0250** | 0.6099 | 1.6272 | **0.7866** | 1.7876 | 1.0684 |
| | TO | 0.0951 | 0.1507 | **0.0197** | 0.6612 | 0.5857 | 0.4606 | **0.2511** | 0.7960 |
| | NPO | 0.2993 | 0.5040 | **0.2282** | 0.6881 | 2.6840 | **2.0717** | 3.9246 | 3.0273 |
| 5% | CO | 0.1164 | 0.1121 | **0.0999** | 0.7287 | 0.1164 | 0.1121 | **0.0999** | 0.7287 |
| | AO | 0.0902 | 0.0861 | **0.0690** | 0.6232 | 1.0437 | **0.9043** | 1.4639 | 1.3442 |
| | TO | 0.1363 | 0.1341 | **0.1006** | 0.6546 | 0.6162 | **0.2750** | 0.2831 | 0.9265 |
| | NPO | 0.3162 | 0.5166 | **0.2463** | 0.6897 | 2.8668 | 3.8831 | 6.6315 | **2.1172** |
| 20% | CO | 0.3958 | 0.3941 | **0.3907** | 0.8113 | 0.3958 | 0.3941 | **0.3907** | 0.8113 |
| | AO | 0.2770 | 0.2760 | **0.2707** | 0.6782 | 1.7547 | **1.0050** | 2.8513 | 1.3052 |
| | TO | 0.4023 | 0.3983 | **0.3917** | 0.7810 | 0.6967 | 0.4606 | **0.4289** | 0.9952 |
| | NPO | 0.4929 | 0.6407 | **0.4585** | 0.8023 | **2.1250** | 2.5423 | 2.8333 | 2.1999 |

The table presents the median values of RMSE and RMSEm (over the 25 runs) of the models reconstructed with the parameter estimates obtained by the four optimization methods from artificial data. The best values for both metrics are given in bold.

Results of the Holm test for significance level α = 0.05.

| | | **RMSE** | | | | **RMSEm** | | | |
|---|---|---|---|---|---|---|---|---|---|
| **i** | **α/i** | **Method** | **z** | **p** | **Hypothesis** | **Method** | **z** | **p** | **Hypothesis** |
| 3 | 0.017 | A717 | 5.69 | 1.25·10^{-8} | Rejected | A717 | 2.53 | 1.14·10^{-2} | Rejected |
| 2 | 0.025 | DASA | 3.16 | 1.57·10^{-3} | Rejected | DASA | 1.58 | 1.14·10^{-1} | Accepted |
| 1 | 0.050 | PSO | 2.53 | 1.14·10^{-2} | Rejected | DE | 1.58 | 1.14·10^{-1} | Accepted |

The table summarizes the outcome of the Holm test performed on the median values of the RMSE and RMSEm results obtained by parameter estimation with the four optimization methods from artificial data. In the first test, based on the median values of the RMSE measure, DE is the reference (control) method; in the second test, based on the median values of RMSEm, PSO is the reference method. In each test, the remaining methods are ordered by rank i, and the hypothesis that a method performs equally well as the reference method is rejected if its p-value is below the corresponding significance threshold.

The RMSE and RMSEm values for the best model estimated from artificial data.

| **Noise** | **Scenario** | **DASA** | | **PSO** | | **DE** | | **A717** | |
|---|---|---|---|---|---|---|---|---|---|
| | | **RMSE** | **RMSEm** | **RMSE** | **RMSEm** | **RMSE** | **RMSEm** | **RMSE** | **RMSEm** |
| 0% | CO | 0.0345 | 0.0345 | 0.0430 | 0.0430 | **0.0064** | *0.0064* | 0.6080 | 0.6080 |
| | AO | 0.0446 | 6.0913 | 0.0406 | – | **0.0043** | 1.1807 | 0.4644 | 21.3690 |
| | TO | 0.0468 | 1.0964 | 0.0447 | – | **0.0110** | 0.1074 | 0.4542 | 0.6150 |
| | NPO | 0.2382 | 2.5430 | 0.3198 | – | **0.1774** | 12.2287 | 0.6220 | 3.0273 |
| 5% | CO | 0.1064 | 0.1064 | 0.1072 | 0.1072 | **0.0977** | *0.0977* | 0.5363 | 0.5362 |
| | AO | 0.0739 | 0.3343 | 0.0803 | 1.8387 | **0.0678** | – | 0.3570 | 0.3723 |
| | TO | 0.1139 | – | 0.1096 | 0.1639 | **0.0985** | 0.5058 | 0.4007 | 0.9028 |
| | NPO | 0.2562 | 3.0161 | 0.3349 | – | **0.2163** | 318.415 | 0.5189 | 1.9670 |
| 20% | CO | 0.3926 | 0.3926 | 0.3925 | 0.3925 | **0.3904** | *0.3904* | 0.6490 | 0.6490 |
| | AO | 0.2742 | 1.3904 | 0.2735 | – | **0.2704** | 1.6916 | 0.4680 | 0.6220 |
| | TO | 0.3948 | 0.4568 | 0.3955 | 0.4368 | **0.3913** | – | 0.5698 | 1.2933 |
| | NPO | 0.4616 | – | 0.5055 | 2.6218 | **0.4448** | 5.4207 | 0.7556 | 7.2268 |

The table presents the RMSE and corresponding RMSEm values for the model simulated with the best parameter values obtained by each of the four optimization methods from artificial data. The best values of RMSE are marked in bold, while the best values of RMSEm are marked in italic.

Parameter estimation with measured data

Table

Results on RMSE of the models estimated from measured data.

| **Scenario** | | **DASA** | **PSO** | **DE** | **A717** |
|---|---|---|---|---|---|
| TO | Best | 0.0661 | 0.0752 | **0.0599** | 0.2482 |
| | Median | 0.0744 | 0.2032 | **0.0643** | 0.2782 |
| | Worst | 0.1530 | 0.2045 | **0.0682** | 0.2898 |
| | Average | 0.0782 | 0.1494 | **0.0647** | 0.2749 |
| | Std | 0.0163 | 0.0627 | **0.0029** | 0.0124 |
| NPO | Best | 0.0665 | 0.0825 | **0.0623** | 0.2453 |
| | Median | 0.0799 | 0.1942 | **0.0649** | 0.3964 |
| | Worst | 0.1788 | 0.2338 | **0.0698** | 0.4920 |
| | Average | 0.0924 | 0.1680 | **0.0654** | 0.3857 |
| | Std | 0.0305 | 0.0471 | **0.0019** | 0.0724 |

The table presents the RMSE values associated with the predicted models (over 25 runs) obtained by parameter estimation with the four optimization methods from measured data. The best values regarding all statistics are given in bold.

RMSE performance of the models obtained by parameter estimation from measured data

**RMSE performance of the models obtained by parameter estimation from measured data**. Boxplots of the performance distributions of the four optimization methods (DASA, PSO, DE, and A717) in terms of the reconstructed output (RMSE), when considering two different observability scenarios (columns TO and NPO) and three datasets: a) measured data, b) artificial data with

The results on measured data confirm the findings of the experiments performed on artificial data. DE consistently leads to models with the smallest RMSE (best performance), regardless of whether we consider the best, median, worst, or average RMSE (over the 25 runs). The boxplots clearly show the statistical significance of the performance differences between the four methods. DASA is the second-best method, PSO is ranked third, and A717 is ranked as the worst-performing method. The higher variance of the RMSE values obtained by PSO in the TO and NPO scenarios with artificial data is confirmed in the experiments with measured data. On measured data, the error distributions (ranges of values) in the two scenarios are very similar (which is somewhat unexpected, given the definition of the scenarios), while on artificial data, the NPO scenario is characterized by higher RMSE than the TO scenario. The error distribution in the measured-data case is closest to the error distributions obtained on artificial data with a noise level of 5%. Similarly, the convergence curves in Figure

Convergence performance of the optimization methods on the task of parameter estimation from measured data

**Convergence performance of the optimization methods on the task of parameter estimation from measured data**. Convergence curves of the four parameter estimation methods (DASA, PSO, DE, and A717) when considering two observation scenarios: a) TO and b) NPO. In order to capture the convergence trend over a wide range of values, the convergence curves are plotted using logarithmic scales on both axes.

As a final test of the quality of the obtained models, we can visually compare the observed outputs with the outputs predicted by the models. In this context, Figure

Simulated behavior of the best models obtained by parameter estimation from measured data in the TO observation scenario

**Simulated behavior of the best models obtained by parameter estimation from measured data in the TO observation scenario**. Experimental (observed) vs. reconstructed output (left-hand side) and simulated behavior (right-hand side) of the model corresponding to the best parameters' values estimated from measured data in the TO observation scenario using: a) DASA and b) DE.

**Supplemental information**. This file contains Figures S1-S10 and Tables S1-S8 with results obtained from parameter estimation in the Rab5-to-Rab7 conversion model. Experimental behavior vs. simulated behavior of the reconstructed output and reconstructed model dynamics with the best parameters estimated by DASA (Figure S1), PSO (Figure S2), DE (Figure S3), and A717 (Figure S4) using measured data. Relative errors of the best estimated parameters by DASA (Table S1), PSO (Table S2), DE (Table S3), and A717 (Table S4) using artificial data. Parameter values associated with the best solutions estimated using measured data (Table S5). Summary of results on the DE estimated parameters with the Monte Carlo-based approach using data with 20% noise in three observation scenarios: CO (Table S6), AO (Table S7), and TO (Table S8). Corresponding histograms of the DE estimated parameters with the Monte Carlo-based approach using data with 20% noise: CO (Figure S5), AO (Figure S6), and TO (Figure S7). Scatter plots of the Monte Carlo-based DE parameter estimates combined with contour plots of the objective function when considering data with 20% noise for the most correlated pairs of parameters in the CO (Figure S8), AO (Figure S9), and TO (Figure S10) observation scenarios.


The left-hand side graphs in additional file

Finally, the analysis of reconstructed model dynamics (complete simulation of all the system variables) of the obtained models reveals further details about their quality. Note first that the simulated behavior in Figure

Overall, the results on measured data show that all three meta-heuristic methods are far better than A717. Among the meta-heuristic methods, DE has a clear advantage over the other two, both in terms of convergence rate and in terms of reconstructing the model output. In terms of other relevant qualitative aspects of the behavior of the obtained models, i.e., the time point of the switch and the ratio between active- and passive-state protein concentrations, one of the other two methods (DASA) performs better than DE. However, note that these qualitative aspects were not included in the objective function (SSE) used by the optimization methods: thus, we cannot objectively and fairly compare the methods along this dimension.
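The core update of the best-performing method is compact enough to sketch. The following minimal DE/rand/1/bin implementation, applied to a toy exponential-decay fitting problem, is illustrative only: the population size, F, CR, bounds, and the toy SSE objective are our own choices, not the configuration used in the experiments reported here.

```python
import math
import random

def de_minimize(obj, bounds, pop_size=20, F=0.8, CR=0.9, generations=200, seed=0):
    """Minimal DE/rand/1/bin: mutate with a scaled difference of two random
    members added to a third, binomially cross with the target vector, and
    keep the trial only if it does not worsen the objective."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [obj(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)  # at least one coordinate is mutated
            trial = list(pop[i])
            for j in range(dim):
                if rng.random() < CR or j == jrand:
                    lo, hi = bounds[j]
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial[j] = min(max(v, lo), hi)  # clip to search interval
            tc = obj(trial)
            if tc <= cost[i]:
                pop[i], cost[i] = trial, tc
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

# Toy SSE objective: recover (amplitude, rate) = (1.0, 0.3) from noiseless data.
data = [(t, 1.0 * math.exp(-0.3 * t)) for t in range(10)]

def sse(p):
    return sum((y - p[0] * math.exp(-p[1] * t)) ** 2 for t, y in data)

best, best_sse = de_minimize(sse, [(0.0, 4.0), (0.0, 4.0)])
```

The greedy one-to-one replacement in the selection step is one plausible reason for DE's low variance across restarts observed above: the population error is monotonically non-increasing.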

Parameter values and practical parameter identifiability

The table below presents the best estimated values of the model parameters obtained from artificial data with 20% noise. Except for k_{4}, k_{8}, k_{12}, k_{15}, R_{5}(0), and R_{7}(0), the relative error of the other estimated parameters is over 100%. On measured data, we do not have reference values, but the comparison (additional file

Best estimated values of the model parameters obtained from artificial data with 20% noise.

| Parameter | c | DASA (CO) | PSO (CO) | DE (CO) | A717 (CO) | DASA (TO) | PSO (TO) | DE (TO) | A717 (TO) |
|---|---|---|---|---|---|---|---|---|---|
| k_{1} | 1 | 4.0000 | 1.4644 | 0.2226 | 1.6393 | 3.1593 | 1.5627 | 1.8293 | 0.2974 |
| k_{2} | 0.3 | 3.7099 | 2.0786 | 1.1132 | 2.5748 | 3.7499 | 2.7324 | 4 | 2.5086 |
| k_{3} | 0.1 | 0.1977 | 1.2612 | 0.1974 | 0.0229 | 3.3526 | 0.2064 | 0.2682 | 0.5823 |
| k_{4} | 2.5 | 3.5412 | 0.4192 | 3.1208 | 3.3226 | 0.2007 | 1.5837 | 3.7871 | 0.1709 |
| k_{5} | 1 | 3.9940 | 1.4613 | 0.2217 | 1.5411 | 3.1287 | 1.4623 | 1.7688 | 0.5845 |
| k_{6} | 0.483 | 0.5165 | 3.0640 | 3.6860 | 1.3897 | 0.4074 | 1.8952 | 0.4713 | 1.9140 |
| k_{7} | 0.21 | 3.9471 | 2.6526 | 0.1503 | 1.5383 | 1.7030 | 2.9345 | 3.4951 | 2.0874 |
| k_{8} | 3 | 3.1843 | 1.5314 | 3.4591 | 1.6254 | 1.5254 | 1.6742 | 3.1784 | 3.5895 |
| k_{9} | 0.1 | 0.1563 | 1.5057 | 0.0524 | 3.1257 | 1.6444 | 1.5321 | 0.9762 | 1.9652 |
| k_{10} | 0.021 | 2.0757 | 1.8316 | 0.0645 | 2.4551 | 0.2091 | 2.1490 | 1.4581 | 1.1780 |
| k_{11} | 1 | 1.8340 | 2.9039 | 2.2013 | 2.3769 | 2.5222 | 3.2725 | 1.9195 | 2.6830 |
| k_{12} | 3 | 3.1572 | 2.2358 | 1.7009 | 2.8349 | 1.2553 | 1.5096 | 0.1557 | 1.2227 |
| k_{13} | 0.31 | 4.0000 | 1.7187 | 0.9381 | 0.6126 | 4.0000 | 2.9568 | 3.4364 | 3.2812 |
| k_{14} | 0.3 | 1.0661 | 1.3179 | 0.3833 | 2.2955 | 1.7539 | 1.1975 | 0.8110 | 2.4085 |
| k_{15} | 3 | 2.3525 | 1.7764 | 3.9800 | 3.6281 | 2.1599 | 2.1684 | 3.5535 | 3.3994 |
| k_{16} | 0.483 | 0.5178 | 3.0728 | 3.6981 | 1.2091 | 0.4224 | 2.0041 | 0.7261 | 2.7693 |
| k_{17} | 0.06 | 1.8635 | 0.4696 | 0.4159 | 2.0548 | 0.8316 | 1.3081 | 1.7687 | 0.2836 |
| k_{18} | 0.15 | 2.7213 | 1.0043 | 0.1087 | 0.4984 | 0.5992 | 1.0744 | 1.3643 | 0.6677 |
| R_{5}(0) | 1.0 | 0.8750 | 0.9116 | 0.9122 | 0.9535 | 0.9957 | 0.4830 | 0.9239 | 1.4011 |
| r_{5}(0) | 0.001 | 4.0E-07 | 0.0358 | 0.1194 | 1.0854 | 3.3E-07 | 0.2313 | 0 | 1.3742 |
| R_{7}(0) | 1.0 | 0.8096 | 1.3352 | 0.7978 | 0.1557 | 1.0153 | 0.4473 | 0.7310 | 1.0320 |
| r_{7}(0) | 0.001 | 1.2E-10 | 0.2444 | 3.4E-04 | 0.8451 | 0.0139 | 0.1696 | 0.2515 | 0.9143 |

Best parameters' values as estimated by the four optimization methods from artificial data with 20% relative noise in the CO and TO observation scenarios.

Evidently, many quite different sets of parameter values produce behaviors that resemble the reference model behavior, suggesting that the endocytosis modeling task, like many others in systems biology, suffers from parameter identifiability problems. Indeed, a systematic study of seventeen systems biology models reported similar identifiability problems. Except for k_{4}, k_{8}, k_{12}, k_{14}, k_{15}, R_{5}(0), and R_{7}(0), the relative error of the other estimated parameters is over 100%. This observation further confirms the statements in the previous paragraph, obtained from the results of all the optimization methods. In extreme cases, the relative errors are over 1000%, as is the case with the parameters k_{10} (in all scenarios), k_{17} (in the CO and TO scenarios), r_{5}(0) (in the CO and TO scenarios), and r_{7}(0) (in the TO scenario). Furthermore, the calculated uncertainties (95% confidence intervals) of the parameters are large, especially for k_{7}, r_{5}(0), and r_{7}(0) over all scenarios; see additional file

Furthermore, the estimated values for many model parameters are evenly distributed across the parameter ranges; see the histograms of the distributions of the estimated parameter values in the additional file. In the CO scenario, the parameters k_{1}, k_{2}, k_{5}, k_{8}, k_{11}, and k_{13} have very similar, (almost) uniform distributions, with a higher concentration of the estimates at the bounds of the allowed range. We observe similar distributions for most of these parameters in the AO scenario (including R_{5}(0), R_{7}(0), k_{6}, and k_{16}) and in the TO scenario (including k_{6}, k_{15}, and k_{16}) as well. For some parameters (like k_{3}, k_{12}, and k_{13} in the CO and AO scenarios), the confidence interval does not include the reference value of the parameter, emphasizing the complexity of the optimization problem and the objective function. A closer look at the histograms reveals that some pairs of parameters have very similar (or almost identical) distributions of the estimates: this is in general the case with the (k_{1}, k_{5}) and (k_{6}, k_{16}) pairs of parameters. Note also how the distributions of the initial values of the system variables r_{5}(0) and r_{7}(0) differ among scenarios. In the case of complete observability, their values follow an (almost) Gaussian distribution around the reference value. In the TO scenario, most of the estimated initial values are in the neighborhood of the reference values, even though the relative errors are higher than in the CO scenario. However, in the AO scenario, the distribution does not resemble a Gaussian; the values are spread all over the corresponding ranges, with higher concentrations at the ranges' limits and far from the reference values. Evidently, the lack of information on the concentration of passive-state proteins (R_{5} and R_{7}) in the data worsens the problems related to parameter identifiability.
The correlation matrices for the estimated parameter values, presented in Figure, reveal several strongly correlated pairs of parameters. In the CO scenario, strong correlations are observed for the (k_{6}, k_{16}), (k_{1}, k_{5}), (k_{7}, k_{18}), and (k_{8}, k_{9}) pairs of parameters, while the pairs (k_{8}, k_{9}), (k_{8}, k_{18}), and (k_{2}, k_{13}) have correlations with 0.84 < |r|. In the AO scenario, the most strongly correlated pair is k_{8} and k_{9}, while in the TO scenario there are six such pairs: the pairs (k_{6}, k_{16}), (k_{1}, k_{5}), and (k_{7}, k_{18}) have an almost perfect linear correlation, and the (r_{5}(0), R_{5}(0)), (k_{2}, k_{13}), and (k_{8}, k_{9}) pairs have correlations with 0.81 < |r|. The contour plots for the (k_{1}, k_{5}) and (k_{6}, k_{16}) pairs of parameters in Figure confirm these correlations, as does the plot for (k_{8}, k_{9}). While the above-mentioned examples of correlated parameters are related to the lack of practical identifiability, the plot for the k_{7} and k_{18} parameters on the right-hand side in Figure indicates non-identifiability of k_{7} in the considered search interval: we observe a very large flat region in the part of the space 0.5 < k_{18} < 4, where k_{7} can take any value without influencing the objective function. A similar observation holds for the k_{9} and k_{18} parameters in Figure, where the k_{9} parameter seems to be structurally non-identifiable. Finally, the right-hand side plot in Figure shows the relationship between the initial values of the Rab5 protein (r_{5}(0) and R_{5}(0)).
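The entries of such correlation matrices are plain Pearson coefficients computed across the Monte Carlo repetitions of the estimation. The sketch below uses synthetic estimates (the parameter names and generated values are illustrative, not our actual Monte Carlo output); a nearly linear dependence between two samples stands in for a strongly correlated, practically non-identifiable pair of parameters.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(1)
# Hypothetical estimates over 100 Monte Carlo repetitions: the second
# parameter tracks the first almost linearly; the third is independent.
p1 = [rng.uniform(0.5, 2.0) for _ in range(100)]
p2 = [v + rng.gauss(0, 0.02) for v in p1]
p3 = [rng.uniform(0.05, 0.2) for _ in range(100)]

r_12 = pearson(p1, p2)  # close to +1: correlated, weakly identifiable pair
r_13 = pearson(p1, p3)  # near 0: no correlation
```

A coefficient near ±1 means the data constrain only a combination of the two parameters (e.g., their sum or product), not each one individually, which is exactly the pattern visible in the scatter and contour plots below.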

Correlation matrices for the parameters' estimates obtained by DE from noisy data (s = 20%) in a Monte Carlo-based approach

**Correlation matrices for the parameters' estimates obtained by DE from noisy data (s = 20%) in a Monte Carlo-based approach**. Colored matrix cells visualize the correlation between the estimates of pairs of parameters.

Contour plots of the objective function with scatter plots of the parameters' estimates obtained by DE from noisy data (s = 20%) in a Monte Carlo-based approach

**Contour plots of the objective function with scatter plots of the parameters' estimates obtained by DE from noisy data (s = 20%) in a Monte Carlo-based approach**. The plots correspond to two representative pairs of correlated parameters in the observation scenarios: a) CO; b) AO; and c) TO. Note that one pair of correlated parameters in the TO observation scenario corresponds to the initial values of the Rab5 protein. The green dot represents the reference parameter value from Eqs. (11) and (12). The red dots are the parameters' estimates obtained by the DE method with the Monte Carlo-based approach.

Conclusions

In this paper, we address the task of parameter estimation in models of the dynamics of biological systems, as considered in the field of systems biology. In this context, it is typical that the considered models are nonlinear (due to the nonlinear behavior of the modeled systems) and have many parameters (are high-dimensional), that the measurements are imperfect (due to measurement noise), and that the system can often be only partially observed (leading to incomplete or misinterpreted measurements). These properties make parameter estimation a challenging optimization problem, calling for the use of advanced optimization methods.

The focus of this paper is the use of meta-heuristic optimization methods for parameter estimation in dynamic system models typical of systems biology. We conduct an extensive experimental comparison of four optimization methods: the differential ant-stigmergy algorithm (DASA), particle-swarm optimization (PSO), and differential evolution (DE), all from the same class of meta-heuristic methods, as well as a local-search derivative-based method (A717). We compare these four methods as applied to a parameter estimation problem representative of the target class of problems described above. We use a practically relevant model of endocytosis that captures the nonlinear dynamics of endosome maturation, reflected in a cut-out switch transition between the Rab5 and Rab7 domain protein concentrations. The model is nonlinear and has many parameters. We compare the performance of the four optimization methods on this task along a number of dimensions, including the quality of reconstructing the observed system output (the measured quantities) and the complete model dynamics (all system variables, including unobserved ones), as well as the speed of convergence. Comparisons are made under different observation scenarios (full observability and different types of partial observability). We use both real (measured) data, containing partial observations of the system, and pseudo-experimental (artificial) data obtained by simulating the model and adding different amounts of artificial noise: the use of pseudo-experimental data allows a more controlled study of the influence of noise and observability on the performance of the parameter estimation (optimization) methods.

Noise in the measurements does influence the performance of the optimization methods, with higher amounts of noise making the task more difficult. The observability of the system (as varied through the observation scenarios) has a much stronger influence: less complete observations make the optimization task much more difficult. The worst results are obtained when the observations are misinterpreted, i.e., when the actual total concentrations of Rab5 and Rab7 are taken to represent the concentrations of these proteins in their active states.

We also investigate the practical identifiability of the model parameters: like many similar tasks in systems biology, the task considered has parameter identifiability problems. These are manifested by high relative errors of the reconstructed parameter values, widely spread, uniform-like distributions of some parameter estimates, and strong correlations between some pairs of estimated parameters. The problems are present in all observation scenarios and are most severe in the case of incomplete observations. The performance of all three meta-heuristic methods is affected by these problems, and they also explain the severe difficulties that the local-search method (A717) experienced on the given parameter estimation task. Overall, the global meta-heuristic methods (DASA, PSO, and DE) clearly and significantly outperform the local derivative-based method (A717). Among the three meta-heuristics, differential evolution (DE) performs best in terms of the objective function, i.e., the quality of reconstructing the expected output, and in terms of the speed of convergence. These results hold for both real and artificial data, for all observability scenarios considered, and for all amounts of noise added to the artificial data. In terms of the quality of reconstructing the complete model dynamics and other qualitative aspects of the behavior of the obtained models, the meta-heuristic methods exhibit different relative performance under different conditions: more work needs to be done to better understand and objectively evaluate these differences.

Further work is needed to confirm and strengthen the conclusions drawn from the experimental evaluation presented in this paper, primarily in the direction of conducting additional experiments. On one hand, we need to test the optimization methods on other tasks of parameter estimation in nonlinear models of biochemical kinetics. On the other hand, we can extend the set of optimization methods applied to the parameter estimation tasks, considering other state-of-the-art algorithms used for parameter estimation in the domain of computational systems biology

Last, but not least, we need to formalize relevant qualitative aspects of model quality (such as the time point of switch between the observed Rab5 and Rab7 concentrations in the endocytosis model) and include these in the formulation of the optimization problem of parameter estimation. These aspects will typically depend on domain knowledge about the particular problem at hand and can be made a part of the overall objective function or formulated as a separate objective function in a multi-objective optimization setting. This will allow us to objectively and fairly evaluate and compare the different optimization approaches from these aspects.

In sum, the bio-inspired meta-heuristic optimization methods considered are suitable for estimating the parameters in the ODE model of the dynamics of endocytosis under a range of conditions. The model considered, as well as the observational conditions (such as partial observability and noise), are representative of parameter estimation tasks in ODE models of biochemical network dynamics. Thus, our results clearly highlight the promise of bio-inspired meta-heuristic methods for solving problems of parameter estimation in models of dynamic systems from the area of systems biology.

Authors' contributions

SD and JŠ initiated the work; PK and KT implemented and adapted the optimization algorithms. KT performed the experimental evaluation of the algorithms and drafted the manuscript. LT, SD, and JŠ gave KT valuable advice on a variety of issues related to the manuscript revisions. All authors read and approved the final manuscript.

Acknowledgements

The real experimental data (total protein concentrations of Rab5 and Rab7) come from the group of Marino Zerial at the Max-Planck Institute for Cell Biology and Genetics in Dresden, Germany. We thank Perla Del Conte-Zerial and her coworkers for providing these data and the cut-out switch model used in this study. We would especially like to thank Yannis Kalaidzidis for the invaluable discussions and feedback on the preliminary experimental results. Finally, the first author would like to acknowledge the support of the Slovenian Research Agency through the young researcher grant No. 1000-06-310026.