Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan

Institute of Systems Biology, Shanghai University, Shangda Road 99, Shanghai 200444, China

Laboratory for Bioinformatics, Graduate School of Systems Life Sciences, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka 812-8581, Japan

Department of Bioengineering, Graduate School of Engineering, The University of Tokyo, Tokyo 113-8656, Japan

Department of Mechanical Science and Bioengineering, Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan

Abstract

Background

The investigation of network dynamics is a major issue in systems and synthetic biology. One of the essential steps in investigating dynamics is estimating the parameters of the model that describes the biological phenomenon. Various techniques for parameter optimization have been devised and implemented in both free and commercial software. While the computational time for parameter estimation has been greatly reduced by improvements in calculation algorithms and the advent of high-performance computers, the accuracy of parameter estimation has not been addressed.

Results

We propose a new approach to parameter optimization that uses differential elimination to estimate kinetic parameter values with a high degree of accuracy. First, we use differential elimination, an algebraic approach for rewriting a system of differential equations into another equivalent system, to derive constraints between the kinetic parameters from the differential equations. Second, we estimate the kinetic parameters by introducing these constraints into the objective function of a standard parameter optimization method, in addition to the error function of the squared difference between the measured and estimated data. To evaluate our method, we performed a simulation study using the objective function with and without the newly developed constraints: the parameters in two models of linear and non-linear equations were estimated by a genetic algorithm (GA) and particle swarm optimization (PSO), under the assumption that only one molecule in each model can be measured. The new constraints were dramatically effective: GA and PSO with the new constraints successfully estimated the kinetic parameters in the simulated models with a high degree of accuracy, while the conventional GA and PSO without them frequently failed.

Conclusions

Introducing new constraints, derived by differential elimination, into the objective function drastically improved the estimation accuracy of parameter optimization methods. The performance of our approach was illustrated by simulations of parameter optimization for two models of linear and non-linear equations, which included unmeasured molecules, with two types of optimization techniques. Our method is thus a promising development in parameter optimization.

Background

The investigation of network dynamics is a major issue in systems and synthetic biology

Boulier and his colleagues developed differential elimination

Here, we propose a new method for optimizing the parameters, by using differential elimination

Results

We first give an overview of our method, and then analyze two models to illustrate its performance. The two models were chosen from representative kinetic models for biological phenomena at the molecular level: one model (Model 1) is composed of two variables, analogous to molecular binding and dissociation, such as affinity binding in an antibody cross-link, and the other model (Model 2) is composed of four variables, analogous to a molecular reaction cascade, such as phosphorylation in signal transduction. Notably, we assumed that only one of the variables in each model is measured.

Overview of present method

The key point of this study is the introduction of new constraints obtained by differential elimination into the objective function, to improve the parameter accuracy. Following an explanation of differential elimination, the method of introducing the constraints is briefly described.

Differential algebra aims at studying differential equations from a purely algebraic point of view

Assume a model of two variables, x_{1} and x_{2}, in Fig.

**Example model** Two molecules bind according to Michaelis-Menten kinetics, and only one molecule, x_{1}, can be measured.

where k_{12}, k_{21}, k_{e} and V_{e} are some constants. Here, two molecules are assumed to bind according to Michaelis-Menten kinetics. Differential elimination then produces the following two equations, which are equivalent to the above system.

When we define the left sides of the above system as C_{1,t} and C_{2,t}, C_{2,t} is composed of x_{1}, its derivatives, and the parameters, x_{2} having been eliminated, whereas C_{1,t} is composed of x_{1}, its derivatives, the parameters and x_{2}. Note that x_{2} in C_{1,t} can be expressed by x_{1}, its derivatives and the parameters in C_{2,t}. The values of C_{1,t} and C_{2,t} can therefore be calculated if we have time-series data of x_{1}, and they would be zero if all parameters were estimated exactly. Thus, C_{1,t} and C_{2,t} can be regarded as a kind of error function that expresses the difference between the measured and estimated data.
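To make the idea concrete, the following minimal sketch uses a simpler, hypothetical linear exchange model rather than the Michaelis-Menten example above: dx1/dt = -k12*x1 + k21*x2 and dx2/dt = k12*x1 - k21*x2, with only x1 measured. Eliminating x2 by hand gives the constraint x1'' + (k12 + k21)*x1' = 0, which contains only x1, its derivatives and the parameters. The model, the function names, the parameter values and the finite-difference derivative estimates are all assumptions made for illustration.

```python
import math

def simulate_x1(k12, k21, x1_0, x2_0, times):
    """Closed-form x1(t) for the hypothetical linear exchange model."""
    total = x1_0 + x2_0                 # x1 + x2 is conserved in this model
    rate = k12 + k21
    x1_inf = k21 * total / rate         # steady state of x1
    return [x1_inf + (x1_0 - x1_inf) * math.exp(-rate * t) for t in times]

def de_constraint(x1, dt, k12, k21):
    """Mean |x1'' + (k12 + k21) * x1'| with central-difference derivatives."""
    residuals = []
    for i in range(1, len(x1) - 1):
        d1 = (x1[i + 1] - x1[i - 1]) / (2.0 * dt)
        d2 = (x1[i + 1] - 2.0 * x1[i] + x1[i - 1]) / dt ** 2
        residuals.append(abs(d2 + (k12 + k21) * d1))
    return sum(residuals) / len(residuals)

dt = 0.01
times = [i * dt for i in range(501)]
# "Measured" series of x1 generated with the assumed true values k12=0.5, k21=0.25.
x1_data = simulate_x1(0.5, 0.25, 10.0, 0.0, times)

c_true = de_constraint(x1_data, dt, 0.5, 0.25)   # close to zero
c_wrong = de_constraint(x1_data, dt, 0.9, 0.25)  # clearly non-zero
print(c_true, c_wrong)
```

With the generating parameters the residual is zero up to discretization error, while a wrong parameter set leaves a clearly non-zero residual. In this toy model the constraint involves only the sum k12 + k21, so on its own it cannot separate the two constants.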

In general, the typical objective function for evaluating the reproducibility of an experimentally measured time-series for a parameter set is the total relative error,

where

Model 1

We analyzed a network model for the binding and dissociation of two molecules (Figure _{AB}, was generated (Figure

**Model 1: binding and dissociation.** The molecular binding and dissociation of two molecules is schematically shown (A). According to the kinetics of the model (see details in Methods), a reference curve of one variable, _{AB}, was generated for 0≦_{A}(0) = 10.0, _{B}(0) = 20.0, _{AB}(0) = 0.0, _{p} = 0.05, _{m} = 0.5, and _{c}=5.0 (B).

Reference data for Models 1 and 2

According to the kinetics of the models for Models 1 and 2, the reference data of one variable, _{AB} (A), and that of one variable, _{1} (B), were generated under the same conditions as those in Figures 2 and 5.

Overall, the introduction of DE constraints into the objective function was highly effective for correctly estimating the parameter values in both GA and PSO.

**Estimated parameter values for Model 1** The parameter sets are estimated by using the genetic algorithm (GA) (A) and the particle swarm optimization (PSO) (B), and in each figure, the histograms of parameter sets with and without DE constraints (right and left sides, respectively) are shown. The bin of the histogram indicates the fraction of the number of parameters within a range (0.01 for _{p}_{m}

**Scatter plot of estimated parameter sets for Model 1** The distributions of the parameter sets by GA and PSO are shown with and without DE constraints (right and left sides, respectively). The black circles indicate the given parameter sets (_{p}_{m}

Model 2

We analyzed a network model for the molecular cascade reaction of four molecules (Figure _{1}, was generated (Figure

**Model 2: cascade reaction** The molecular cascade reaction of four molecules is schematically shown (A). According to the kinetics in the model (see details in Methods), a reference curve of one variable, _{1}, was generated for 0≦_{1}(0) = 10.0, _{2}(0) = 130.0, _{3}(0) =80.0, _{4}(0)=170.0, _{21}=5.0, _{31}=7.0, _{41}=11.0, _{p}_{2} = 3.0, _{p}_{3}=4.0, _{p}_{4}=10.0, _{e}_{1}=5.0 and _{e}_{2}=3.0 (B).

DE constraints for Model 1

The equivalent equations for Model 1 were derived from the system of differential equations by differential elimination.

In Model 2, the introduction of DE constraints into the objective function was also highly effective for correctly estimating the parameter values in both GA and PSO (Figure _{31} and _{41} failed without the introduction (left side). By using PSO (Figure _{41} failed without the introduction (left side). Furthermore, the features of the distribution forms of the estimated values were similar to those in Model 1 (Figure

**Estimated parameter sets for Model 2** The parameter sets are estimated by using GA (A) and PSO (B), and in each figure, the histograms of parameter sets with and without DE constraints (right and left sides, respectively) are shown. The bin of the histogram indicates the fraction of the number of parameters within a range (1.0 for all parameters) to the total number of trial successes (200 for all cases) (see details in Methods).

The contraction of parameter space with the introduction of DE constraints into the objective function is shown more clearly in Figure

**Scatter plot of estimated parameter sets for Model 2** The distributions of the parameter sets by GA and PSO are shown with and without DE constraints (right and left sides, respectively). The black circles indicate the given parameter sets (_{21}=5.0, _{31}=7.0, and _{41}=11.0).

Discussion

The introduction of DE constraints into the objective function clearly improved the parameter accuracy. Indeed, the parameter sets were correctly estimated when the DE constraints were introduced into the objective function, whereas they were falsely estimated without them. Furthermore, the parameter sets estimated with the constraints were sharply distributed near the correct values in all cases, in contrast to the wide distribution without them. In general, the derivatives carry information on the shape of the measured time-series curve, such as its slope, extremal points and inflection points. This indicates that the new objective function evaluates the difference in not only the values but also the curve shapes between the measured and estimated data, while the standard objective function evaluates only the difference in values. Note that the DE constraints are derived in a mathematically rigorous manner from the original system of differential equations for a given model. Thus, our approach is expected to be a general approach in parameter optimization for improving the parameter accuracy.
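The contrast between a value-only objective and one that also uses derivative information can be sketched as follows. This is a minimal illustration and not the implementation used in this study: it assumes a mean relative error E, a hypothetical one-variable decay model x(t) = x0*exp(-k*t), and a hypothetical weight `alpha` that combines the two terms as alpha*E + (1 - alpha)*C. For a one-variable model no elimination is needed, so the differential equation itself, x' + k*x = 0, plays the role of the constraint.

```python
import math

def relative_error(calc, meas):
    """Mean absolute relative error between calculated and measured series."""
    return sum(abs((c - m) / m) for c, m in zip(calc, meas)) / len(meas)

def derivative_constraint(meas, dt, k):
    """Mean |x' + k*x| on the measured data, with x' from central differences."""
    res = [abs((meas[i + 1] - meas[i - 1]) / (2.0 * dt) + k * meas[i])
           for i in range(1, len(meas) - 1)]
    return sum(res) / len(res)

def objective(params, meas, times, dt, alpha=0.5):
    """alpha * E + (1 - alpha) * C for the assumed model x(t) = x0*exp(-k*t)."""
    x0, k = params
    calc = [x0 * math.exp(-k * t) for t in times]
    return (alpha * relative_error(calc, meas)
            + (1.0 - alpha) * derivative_constraint(meas, dt, k))

dt = 0.05
times = [i * dt for i in range(101)]
# Noise-free "measured" data generated with the assumed true values x0=10, k=0.5.
measured = [10.0 * math.exp(-0.5 * t) for t in times]

f_true = objective((10.0, 0.5), measured, times, dt)  # near zero
f_off = objective((10.0, 0.8), measured, times, dt)   # penalized by both terms
print(f_true, f_off)
```

The derivative term is evaluated directly on the measured series, so it penalizes a parameter set whose implied slope disagrees with the data even before the model is integrated.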

To further test the performance of the present constraints in more realistic situations, we estimated the same parameter sets in Models 1 and 2 from simulated data with noise (see Methods). The reference curves for Models 1 and 2 were generated (Additional file

**Scatter plot of estimated parameter sets for the simulation data with noise for Models 1 and 2** The distributions of the parameter sets by GA for the data with noise (see details in Methods and Additional file

As expected, the new objective function requires more computational time than an objective function with only a standard error function, because of the additional functions in the DE constraints. Indeed, the computational time of our method was longer than that of the standard method in Models 1 and 2; the computational times for the standard method and our method were 0.4 and 2.3 hours in Model 1, and 0.03 and 0.22 hours in Model 2 (32 CPUs of Intel(R) Xeon(R) X5550 2.67GHz). In addition to the computational time, a pitfall of our method is the equation size of the DE constraints. In the equivalent systems, the number of terms frequently increases (see Additional file

DE constraints for Model 2

The equivalent equations for Model 2 were derived from the system of differential equations by differential elimination.

Furthermore, introducing the DE constraints produced more local minima in the objective function, again because of the additional functions. Indeed, the number of successful estimations by GA in our method was smaller than that of the standard method in Model 1. To further survey the effects of the landscape of DE constraints on the parameter estimation, we performed parameter optimization by using a gradient method, the modified Powell method

**Scatter plot of estimated parameter sets by the modified Powell method for Models 1 and 2** The distributions of the parameter sets by the modified Powell method are shown for Model 1 (upper) and Model 2 (lower), with and without DE constraints (right and left sides, respectively). The black circles indicate the given parameter sets for the two models.

One possible use of our method is its application to inferring networks whose structure is unknown. Since the present method assumes a known network structure, its range of application to network inference is naturally restricted. However, our method can select the most plausible network structure from among networks with similar structures. Indeed, we designed a similar procedure for evaluating the network structures with measured data

Various models for describing biological phenomena are available

Conclusions

The introduction of constraints obtained by differential elimination effectively improved the parameter accuracy in two models of linear and nonlinear equations with two optimization techniques, especially when we assumed that unmeasured variables were included. This clearly indicates that the ability of our method to estimate the parameter values was far superior to that of methods using only the standard error function. Although the present study focused on two simple models, our method is a feasible approach for parameter estimation in network dynamics.

Methods

Analyzed models

The system of differential equations in Model 1 is expressed as follows:

We assume that the model expresses the binding and dissociation between two molecules, and that only one complex, _{AB}

The system of differential equations in Model 2 is expressed as follows:

We assume that the molecules, _{2}, _{3}, and _{4}, activate _{1} with linear relationships, and that only one molecule, _{1}, can be measured.

The data with noise were generated by the Box-Muller method.
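As a reminder of how the Box-Muller method works, the sketch below turns two uniform random numbers into one standard normal deviate and uses it to perturb a clean time series. The noise model (additive Gaussian noise with a standard deviation proportional to each value), the 5% noise level and all function names are assumptions for illustration and may differ from the noise model used in this study.

```python
import math
import random

def box_muller(rng):
    """Return one standard normal deviate from two uniform deviates."""
    u1 = 1.0 - rng.random()          # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

def add_noise(series, level, rng):
    """Perturb each value by Gaussian noise with std = level * |value| (assumed)."""
    return [x + level * abs(x) * box_muller(rng) for x in series]

rng = random.Random(0)

# Sanity check: sample mean and variance should be close to 0 and 1.
samples = [box_muller(rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)

# Apply 5% noise to a hypothetical clean decay curve.
clean = [10.0 * math.exp(-0.05 * i) for i in range(50)]
noisy = add_noise(clean, 0.05, rng)
```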

Optimization techniques

Two well-known parameter optimization techniques, the genetic algorithm (GA) and particle swarm optimization (PSO), were used.
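For readers unfamiliar with PSO, a minimal global-best variant can be sketched as follows. The inertia and acceleration coefficients, swarm size, iteration count, search bounds and the quadratic test objective are assumptions chosen for illustration, not the settings used in this study.

```python
import random

def pso(objective, bounds, n_particles=30, n_iter=200,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [objective(p) for p in pos]         # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best so far

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

def toy_objective(p):
    """Hypothetical quadratic with its minimum at (0.05, 0.5)."""
    return (p[0] - 0.05) ** 2 + (p[1] - 0.5) ** 2

best, value = pso(toy_objective, [(0.0, 1.0), (0.0, 1.0)])
print(best, value)
```

In a parameter estimation setting, `toy_objective` would be replaced by the objective function described in the next subsection, evaluated for a candidate parameter set.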

Introduction of the new constraints into the objective function

The objective function in this study is composed of two terms: one is the standard error function between the estimated and measured data, and the other is the constraints obtained by differential elimination. The error function is defined as follows. Suppose that x^{c}_{i,t} is the calculated value of x_{i} at time t, and that x^{m}_{i,t} is the corresponding measured value. The error function measures the deviation of x^{c}_{i,t} from x^{m}_{i,t}.

where

Next, we define the constraints obtained by differential elimination. In general, differential elimination rewrites the original system of differential equations into an equivalent system, which means that the number of equations is equal in both systems. Thus, we can express the constraint by differential elimination, C_{DE}, as the linear combination of the equations in the equivalent system, as follows:

where

Finally, we introduce C_{DE} into the objective function,

where

Implementation of differential elimination

All of the symbolic computations for the differential elimination were performed using the _{A} ≻ _{B} ≻ _{AB} in Model 1 and P(Pool) ≻ _{4} ≻ _{3} ≻ _{2} ≻ _{1} in Model 2. Subsequently, we converted the form of the polynomial equations derived by differential elimination to the Java code by using the

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

MN performed the implementation and the calculations, and participated in the design of the study. KH conceived of the study, participated in its design and coordination, and drafted the manuscript. MO, YT, and JM participated in the design of the study, and helped to draft the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This work was partly supported by a project grant, ‘Development of Analysis Technology for Gene Functions with Cell Arrays’, from The New Energy and Industrial Technology Development Organization (NEDO). KH was partly supported by a Grant-in-Aid for Scientific Research on Priority Areas "Systems Genomics" (grant 20016028) and for Scientific Research (A) (grant 19201039) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. In particular, the authors would like to express their gratitude to Drs. Alexander Sedoglavic, Francois Lemaire, and Francois Boulier of Lille University, for valuable discussions during the course of this work.

This article has been published as part of