School of Computer Science and Manchester Centre for Integrative Systems Biology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK

Theoretical Physics Division, School of Physics and Astronomy, The University of Manchester, Manchester, M13 9PL, UK

Virginia Bioinformatics Institute, Virginia Tech, Washington Street 0477, Blacksburg, VA 24061, USA

Abstract

Background

Stochastic fluctuations in molecular numbers have in many cases been shown to be crucial for the understanding of biochemical systems. However, the systematic study of these fluctuations is severely hindered by the high computational demand of stochastic simulation algorithms. This is particularly problematic when, as is often the case, some or many model parameters are not well known. Here, we propose a solution to this problem, namely a combination of the linear noise approximation with optimisation methods. The linear noise approximation is used to efficiently estimate the covariances of particle numbers in the system. Combining it with optimisation methods in a closed loop to find extrema of covariances within a possibly high-dimensional parameter space allows us to answer various questions. For example: what is the lowest amplitude of stochastic fluctuations possible within given parameter ranges? Or, which specific changes of parameter values increase the correlation between certain chemical species? Unlike stochastic simulation methods, this approach does not require small numbers of molecules and can thus be applied to cases where stochastic simulation is prohibitive.

Results

We implemented our strategy in the software COPASI and show its applicability on two different models of mitogen-activated protein kinase (MAPK) signalling: one generic model of extracellular signal-regulated kinases (ERK) and one model of signalling via p38 MAPK. Using our method we were able to quickly find local maxima of covariances between particle numbers in the ERK model depending on the activities of phospho-MKKK and its corresponding phosphatase. With the p38 MAPK model our method was able to efficiently find conditions under which the coefficient of variation of the output of the signalling system, namely the particle number of Hsp27, could be minimised. We also investigated correlations between the two parallel signalling branches (MKK3 and MKK6) in this model.

Conclusions

Our strategy is a practical method for the efficient investigation of fluctuations in biochemical models even when some or many of the model parameters have not yet been fully characterised.

Background

Random fluctuations in discrete molecular numbers can have significant impact, both detrimental and constructive, on the functioning of biochemical systems

Biochemical systems have evolved to be robust against molecular fluctuations by attenuation, or even to exploit them (see

For a quick characterisation of the fluctuations in a biochemical system there exists an alternative, namely the linear noise approximation (LNA; see,

Often, in practice, one or more of the parameters of a model, such as reaction rates or initial concentrations, cannot be exactly determined. For instance, such parameters might only be known to lie within a certain range or nothing might be known about them at all. This uncertainty about parameters can translate into uncertainty about the system behaviour when it has high sensitivity towards those parameters. This is also true for molecular fluctuations in the system since their expected amplitude and other properties depend on parameter values. If only one or two parameters are unknown it is possible to exhaustively scan this parameter space using a regular grid or other techniques to probe how the model is affected by variations in values of those parameters. However, this approach is not feasible if the number of unknown parameters is large since the hyper-volume of the parameter search space increases exponentially with the number of uncertain parameters, and consequently so does the computational time.

In this article we introduce a different strategy to study random fluctuations in biochemical models with parameters that are not well characterised. Our approach combines the LNA with optimisation methods to search the unknown parameter space for parameter values that lead to extrema in covariance estimates. This can dramatically reduce the required computation time compared to exhaustive searches with stochastic simulations, thereby permitting studies of stochastic fluctuations that were not possible before. We will show a relevant biological example of a search for conditions that minimise the noise in the output of a p38 MAPK signalling system. Scanning the parameter space on a regular grid with stochastic simulation is infeasible here because it would take more than 2.4 · 10^{17} years. Our method, in contrast, was able to find these conditions in 25 min. The strategy we propose therefore makes it possible to gain biological insight into the noise structure of relevant biological systems even if these systems are large and the parameters are not well defined.

Global optimisation methods have been shown to be effective in finding good extrema estimates of dynamic properties of biochemical network models even in high-dimensional search spaces

The application of this strategy passes through a closed loop containing the automatic calculation of a steady state, the LNA method and one optimisation algorithm; alternatively the method is also appropriate to use with parameter scanning or sampling algorithms instead of the optimisation. We implemented this strategy in the software COPASI
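The closed loop described above (steady state, then LNA, then optimiser) can be sketched in a few lines. The following is only an illustration under stated assumptions and not the COPASI implementation: it uses a toy reversible isomerisation A ⇌ B, which is not one of the models studied in this article, because its steady state and LNA variance are available in closed form, and it uses a plain random search in place of COPASI's optimisation algorithms. All function names and parameter values are hypothetical.

```python
import math
import random


def steady_state_a(k_f, k_b, n_total):
    """Steady-state particle number of A for the toy system A <-> B."""
    return n_total * k_b / (k_f + k_b)


def lna_variance_a(k_f, k_b, n_total):
    """LNA variance of A from the one-dimensional Lyapunov equation 2*J*C + D = 0."""
    a_ss = steady_state_a(k_f, k_b, n_total)
    jacobian = -(k_f + k_b)                          # d/dA of (-k_f*A + k_b*(N - A))
    diffusion = k_f * a_ss + k_b * (n_total - a_ss)  # sum of both reaction rates
    return -diffusion / (2.0 * jacobian)


def random_search(objective, lower, upper, n_samples=2000, seed=1):
    """Maximise objective(k) on [lower, upper], sampling k uniformly on a log scale."""
    rng = random.Random(seed)
    best_k, best_value = None, -math.inf
    for _ in range(n_samples):
        k = 10.0 ** rng.uniform(math.log10(lower), math.log10(upper))
        value = objective(k)  # steady state and LNA are evaluated inside the loop
        if value > best_value:
            best_k, best_value = k, value
    return best_k, best_value


N_TOTAL = 1000  # hypothetical total particle number
K_B = 1.0       # hypothetical fixed backward rate constant

best_k, best_var = random_search(
    lambda k_f: lna_variance_a(k_f, K_B, N_TOTAL), lower=0.1, upper=10.0
)
print(f"largest LNA variance {best_var:.1f} found at k_f = {best_k:.3f}")
```

For this toy system the variance is N · k_f · k_b / (k_f + k_b)^2, which has its maximum N/4 at k_f = k_b, so the result of the search can be checked by hand.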

Results

Implementation of the method in COPASI

The software COPASI

**Screen shot of the LNA implementation in COPASI**. Screen shot of the COPASI graphical user interface and the linear noise approximation task. Shown is the resulting covariance matrix of species' particle numbers in the p38 MAPK model by Hendriks

Our implementation allows arbitrary objective functions to be optimised. For instance, LNA estimates of covariances of different chemical species, as well as other model observables, can be combined into a complex objective function. This allows the calculation of various quantities of interest, for instance, Fano factors
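As an illustration of how such derived quantities follow from the means and the covariance estimates, the short sketch below computes a Fano factor, a coefficient of variation and a correlation coefficient. The helper names and the numerical values are hypothetical and are not taken from COPASI or from the models in this article.

```python
import math


def fano_factor(variance, mean):
    """Variance divided by the mean (equals 1 for a Poisson distribution)."""
    return variance / mean


def coefficient_of_variation(variance, mean):
    """Standard deviation divided by the mean."""
    return math.sqrt(variance) / mean


def correlation_coefficient(cov_xy, var_x, var_y):
    """Pearson correlation from one covariance and the two variances."""
    return cov_xy / math.sqrt(var_x * var_y)


# Hypothetical steady-state means and covariance estimates for two species X and Y.
mean_x, mean_y = 400.0, 900.0
var_x, var_y, cov_xy = 100.0, 900.0, 150.0

print("Fano factor of X:", fano_factor(var_x, mean_x))
print("CV of Y:", coefficient_of_variation(var_y, mean_y))
print("correlation of X and Y:", correlation_coefficient(cov_xy, var_x, var_y))
```

Any of these expressions could serve as the objective function of an optimisation, since each depends only on quantities that the steady-state calculation and the LNA provide.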

Application of the method on MAPK signalling systems

Signalling through mitogen-activated protein kinases (MAPK) is involved in a broad range of cellular processes, such as proliferation, differentiation, stress responses and apoptosis. Therefore it is also implicated in a variety of diseases like cancer, stroke or diabetes

There exist different specific MAPK signalling pathways with different functions, for example ERK1/2, p38 or JNK, with different topologies and characteristics. However, in most cases the basic structure is that of a three-tier cascade. Here, the MAPKs on the output level, such as ERK1/2 or p38, phosphorylate transcription factors or other proteins to trigger specific cellular responses. The MAPKs are, in turn, activated

Fluctuations in a model of ultrasensitivity in ERK MAP kinase signalling

We will now apply the LNA to a MAPK cascade model due to Kholodenko _{I }

**Stochastic simulation of the ERK MAPK model**. MKKK and phospho-MKK particle numbers _{I }= 45, _{cell }^{-14 }l and all other parameters as in

It is interesting to see how the magnitude of fluctuations changes with the reaction parameters. As an example, we used our LNA implementation in COPASI in combination with a parameter scan to investigate how changes in the reaction parameter _{2 }affect the variance of MKKK (MAPK kinase kinase). Values of _{2 }were scanned within a certain range and the LNA automatically calculated for each value of _{2}. In the model, this parameter corresponds to the _{max }

At present, protein kinases are much better characterised at the molecular level than protein phosphatases. As a consequence, the effects of phosphatases are often not studied in signalling models. Here, however, we are able to show that the activity of the MKKK-phosphatase influences not only the type of dynamics the system exhibits (the steady state becomes unstable at _{2 }= 0.446 due to a Hopf bifurcation) but also, and strongly, the intrinsic fluctuations in the system. As can be seen in Figure _{2 }approaches the bifurcation point and, interestingly, it shows a local maximum at _{2 }= 0.32 of 987.7 particles^{2}. The value of _{2 }in Figure

**Parameter scan of MKKK particle number variance against reaction parameter v_{2} in the ERK MAPK model**. A parameter scan of the variance of the particle number of species MKKK has been carried out for a range of values of the reaction parameter

We then wanted to investigate the conditions under which fluctuations in chemical species at different positions of the signalling cascade become correlated. To achieve this, we used the optimisation task in COPASI to maximise the covariance of the fluctuations of MKKK and MKK-P, allowing the reaction parameters _{2 }and _{4 }to vary over a given range of values. Using the evolutionary programming algorithm ^{2 }for _{2 }= 0.3226 and _{4 }= 0.0166. The algorithm converged to this value after only 880 iterations. A parameter scan over the same parameter space was also performed to better illustrate the change in correlation with these two parameters. Figure

**Two-dimensional parameter scan of MKKK and MKK-P particle numbers' covariance in the ERK MAPK model**. A two-dimensional parameter scan of the covariance of the particle numbers of species MKKK and MKK-P. The parameter _{2 }was varied between 0.22 and 0.41 and the parameter _{4 }between 0.015 and 0.035.

Fluctuations in a model of p38 MAPK signalling

The so-called p38 mitogen-activated protein kinases (p38 MAPK) are responsive to proinflammatory cytokines and stress factors

The model we use for this study was developed in Hendriks

**Model of p38 MAPK signalling**. The model of p38 MAPK signalling

As mentioned above, random fluctuations in signalling systems are particularly interesting to study, since here copy numbers of the different species are often low. For instance, MKK3 and MKK6 are typically present in the order of only ten thousand particles per cell. This could lead to pronounced fluctuations which hamper reliable information transfer through this signalling pathway. But perhaps there are conditions (parameter values) for which these fluctuations are minimised, which is what we want to investigate.

First we looked at the estimated variances of different signalling intermediates, such as phospho-MKK3, phospho-MKK6, cytosolic phospho-p38 and nuclear phospho-p38 with varying stimulus strength,

**Variances of species' particle numbers versus stimulation strength in the p38 MAPK model**. Panel A: Variance of cytosolic phospho-MKK3 (×), phospho-MKK6 (□), phospho-p38 (○), and nuclear phospho-p38 (Δ) particle numbers

By contrast, phospho-Hsp27, the endpoint of the modelled signalling pathway, shows a decrease in its variance with increasing stimulation (Figure

However, looking at the coefficient of variation (CV) both nuclear phospho-p38 and cytosolic phospho-Hsp27 show a decrease of variation with increasing stimulation due to increasing steady state particle numbers (Figure

**Coefficient of variation of nuclear phospho-p38 vs. stimulation strength in the MAPK model**. Coefficient of variation of nuclear phospho-p38

An interesting property of the p38 MAPK pathway is the existence of two parallel signalling branches, through MKK3 and MKK6, that both can phosphorylate p38 MAPK. Therefore, we were interested in whether fluctuations in the MKK3 branch correlate with fluctuations in the MKK6 branch. First, we scanned the estimated covariance of phospho-MKK3 and phospho-MKK6 over a range of stimulus strengths. We found that the fluctuations in the two branches seem to be mostly uncorrelated (the LNA actually estimates a very weak anti-correlation for higher initial concentrations of LPS, data not shown), an indication that the largest part of the fluctuations does not originate from the common upstream part of the two branches but rather from within the branches themselves.

We now wanted to investigate how the parameters in the system influence this anti-/correlation. Therefore, we searched for extreme values of the LNA-estimated correlation coefficient of phospho-MKK3 and phospho-MKK6

We therefore ran the LNA in combination with the particle swarm optimisation algorithm of COPASI, using the correlation coefficient as the objective function for maximisation. In addition, we set constraints on the steady-state particle numbers in the system. Both phospho-MKK3 and phospho-MKK6 particle numbers were allowed to change only 4-fold,

We used a particle swarm optimisation

Finally, we were interested in the influence that different choices for parameters in the two branches have on the fluctuations of the output of the signalling pathway (phospho-Hsp27) or, in other words, how reliable or noisy the overall signalling pathway can be. We used a particle swarm optimisation (swarm size = 50) ^{6 }particles. With the original parameter set the steady-state particle number of phospho-Hsp27 was 4.647 · 10^{6 }particles. The result of this second calculation is shown in column "Changes (constrained)" of Table

Optimisation of the coefficient of variation of phospho-Hsp27 particle numbers

| **Reactions** | **Changes (no constraints)** | **Changes (constrained)** |
|---|---|---|
| complex + MKK6 ↔ complex_MKK6 | ⇒ | ⇒ |
| complex_MKK6 → complex + MKK6P | ⇒ | ⇒ |
| MKK6_phosphatase + MKK6P ↔ Ppase_MKK6P | ⇐ | ~ |
| Ppase_MKK6P → MKK6_phosphatase + MKK6 | ⇐ | ⇐ |
| complex + MKK3 ↔ complex_MKK3 | ⇒ | ⇒ |
| complex_MKK3 → complex + MKK3P | ⇒ | ⇒ |
| MKK3_phosphatase + MKK3P ↔ Ppase_MKK3P | ⇐ | ⇐ |
| Ppase_MKK3P → MKK3_phosphatase + MKK3 | ⇐ | ⇐ |
| MKK6P + p38 ↔ MKK6P_p38 | ⇒ | ⇒ |
| MKK6P_p38 → MKK6P + p38P | ⇒ | ⇒ |
| MKK3P + p38 ↔ MKK3P_p38 | ⇒ | ⇐ |
| MKK3P_p38 → MKK3P + p38P | ⇒ | ⇐ |
| p38_phosphatase + p38P ↔ Ppase_38P | ⇐ | ⇔ |
| Ppase_p38P → p38_phosphatase + p38 | ⇐ | ⇒ |

Optimisation of the coefficient of variation of phospho-Hsp27 particle numbers with regard to all 21 reaction parameters of the listed reactions ([LPS]_{0} = 1 ng/ml). "Changes (no constraints)" means that the coefficient was optimised without any further constraints, whereas "Changes (constrained)" means that the phospho-Hsp27 particle number was constrained during optimisation to stay below the limit of 4.65 million particles. "⇒" ("⇐") denotes an increase (decrease) in the forward rate and, in the case of a reversible reaction, a decrease (increase) in the reverse rate. "⇔" means that both forward and reverse rates are increased and "~" means that the optimisation led to no clear change.

We would like to note here that a (naive) comprehensive search for optima using a regular grid approach and stochastic simulations of the system would, in this particular case, have taken a prohibitively long computation time. Assuming that, within the given limits, we only look at ten different values per parameter, we would have 10^{(number of parameters)} = 10^{21} sample points. For each point we would need to carry out a stochastic simulation that, including the calculation to allow the system to settle down to a steady state, takes approximately 7700 s on a typical desktop computer (for a simulated time of 10000 s). Neglecting the time needed to calculate the actual statistics on the simulated time series, this would lead to a computation time of more than 10^{21} · 7700 s ≈ 2.4 · 10^{17} years. And this would only explore ten values of each parameter (
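The estimate above can be reproduced with a few lines of arithmetic. The conversion factor from seconds to years (a year of 365.25 days) is an assumption made here; the other numbers are those quoted in the text.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year for the conversion

n_parameters = 21              # reaction parameters varied in the optimisation
values_per_parameter = 10      # grid resolution assumed in the text
seconds_per_simulation = 7700  # wall-clock time of one stochastic simulation

grid_points = values_per_parameter ** n_parameters
total_years = grid_points * seconds_per_simulation / SECONDS_PER_YEAR
print(f"{grid_points:.0e} grid points, about {total_years:.2e} years")
```

Because the number of grid points grows as a power of the number of parameters, each additional uncertain parameter multiplies this time by a further factor of ten.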

Discussion

Our contribution with this work is two-fold. First, we implemented the linear noise approximation in the freely available software COPASI, and thus made it accessible to a large group of users. Second, we showed how the LNA, in combination with multi-dimensional parameter scans or with global numerical optimisation methods, can be used to quickly characterise the influence of parameters on intrinsic fluctuations in biochemical models even when there is considerable uncertainty about a number of parameters. We showed, with realistic biochemical signalling models, that this approach makes it possible to explore parameter space and find conditions for which noise is minimal or maximal. It is also possible to search for conditions where specific model variables are highly (or poorly) correlated. The method thus provides an important new way to explore the universe of behaviours displayed by models. Given the importance of noise and fluctuations in intracellular biochemistry, it is of great value for the study of those systems.

In the recent article by Komorowski

In certain cases, however, care should be taken when using the LNA. This is due to the assumption that the fluctuations are Gaussian in nature. Problems can arise if the system is close to a boundary. For example, if the number of molecules for a particular species is very close to zero the probability distribution for the fluctuations becomes 'squashed' (which the LNA does not take into account), to satisfy the requirement that the probability to have a negative number of molecules present is zero. Boundaries can also arise due to conservation relations, which are discussed in the Methods section, as these add constraints to the system. When using the LNA in combination with one of the optimisation algorithms in COPASI, such systems near boundaries are sometimes found, especially when the user wishes to minimise a covariance, as we found when studying the p38 MAPK model. This is because the fluctuations can be very small when the system is close to a boundary, which can give the impression that the fluctuations of two different species are uncorrelated, which may not be the case away from the boundary. In these cases, adding constraints to the particle numbers (as we did when studying the p38 MAPK model) helps to keep the system away from these states. The current implementation of the LNA in COPASI is only able to consider models in which the reactions all occur within one compartment. As many biochemical models involve multiple compartments we hope to extend our work, so that in future it will be possible to use the LNA to study a wider range of models.
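The boundary caveat discussed above can be made concrete by asking how much probability a Gaussian distribution places on negative particle numbers. The sketch below is an illustration only: it assumes Poisson-like fluctuations (variance equal to the mean) and uses hypothetical mean values; the function name is not part of COPASI.

```python
import math


def gaussian_mass_below_zero(mean, variance):
    """Probability that a Gaussian with this mean and variance takes a negative value."""
    z = -mean / math.sqrt(variance)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical mean particle numbers, with the variance set equal to the mean.
for mean in (1.0, 4.0, 25.0, 100.0):
    p_negative = gaussian_mass_below_zero(mean, variance=mean)
    print(f"mean {mean:6.1f}: Gaussian probability of n < 0 = {p_negative:.2e}")
```

When this probability is not negligible, the Gaussian assumption underlying the LNA is questionable for that species, which is one reason for constraining particle numbers during an optimisation.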

Methods

Biochemical network models of the kind we analyse here can be described as consisting of

where the numbers _{iμ }_{iμ }

All the reactions above are strictly irreversible; without loss of generality, any chemically reversible reaction is therefore described as two separate irreversible reactions. The elements of the stoichiometry matrix, _{iμ }_{iμ }_{iμ}_{i }

Michaelis-Menten reaction mechanism

| **Reactions** | **Kinetics** |
|---|---|
| → S | _{1} = _{1} |
| S + E → SE | _{2} = _{2} · [S] · [E] |
| SE → S + E | _{3} = _{3} · [SE] |
| SE → P + E | _{4} = _{4} · [SE] |
| P → | _{5} = _{5} · [P] |

A substrate, S, is converted to a product, P, _{1 }= S, _{2 }= E, _{3 }= P and _{4 }= SE. Also, _{11 }= 1, _{21 }= 0 and so on. The total number of enzyme molecules,

To specify the model, kinetic functions ** n **= (

(The ODEs for the species that have been eliminated can be found by using the conservation relations.) However, the large system size limit is inappropriate for many systems of interest: in particular, when the molecular populations are low (and the volume is small, as in most cells), the discrete nature of the molecules has important consequences. In these cases a stochastic description is required.

The starting point for the stochastic description is the chemical master equation, which specifies how the probability that the system is in the state ** n **at time

where _{μ }= (_{1μ}, ..., _{Kμ}** n**, 0) is given. If we multiply Eq. (3) by

Dividing Eq. (4) by _{μ}** x**) = lim

Where

In all the investigations we will carry out in this paper, we will be interested in fluctuations about the stationary state. In terms of the deterministic dynamics Eq. (2), the solution ** x**(

The Fokker-Planck Eq. (5) is linear, and therefore its solution, Π(** ξ**,

All the matrices in Eq. (7) are dimension

Since 〈_{i}_{i }

The Lyapunov equation, analogous to Eq. (7) is therefore

The equation can be solved for ^{T }

Therefore our implementation of the LNA first automatically determines existing conservation relations (also known as conserved moieties) and reduces the system from

For convenience, the state vector ** n **should be written with the

where the _{j }_{jk }

Examining the conservation relations after the change of variables used in the van Kampen expansion

But the conservation equations should hold in the deterministic limit (

Therefore,

We can now use the above results to compute the remaining covariances. First of all we calculate Ξ_{ij}

since 〈ξ_{i}_{j}_{ij }_{ij }

Again, we have obtained an expression in terms of known quantities. As before, _{ij }_{ij}

we can write this more concisely:

with ^{red. }the

We will illustrate the procedure by examining the Michaelis-Menten reaction mechanism, described earlier in Table

where _{1 }is the concentration of species S, _{2 }is the concentration of SE, _{3 }is the concentration of P and _{4 }is the concentration of E. The system contains one conservation relation, as the total number of enzyme molecules (whether they are free, or bound in the intermediate complex) is constant. We will write this as _{2 }+ _{4 }= _{4 }from the ODEs, and re-write them in a simpler form,

The steady state is calculated by setting the time derivatives to zero and solving the resulting equations simultaneously. The steady state values for the concentrations are shown below:

From Eq. (6),

Once values of the reaction parameters have been chosen, numerical values of

Covariance matrix

| | **S** | **SE** | **P** | **E** |
|---|---|---|---|---|
| **S** | 1455.82 (1455.57) | 61.35 (61.23) | 33.46 (32.80) | -61.35 (-61.23) |
| **SE** | 61.35 (61.23) | 59.09 (59.07) | -4.36 (-4.43) | -59.09 (-59.07) |
| **P** | 33.46 (32.80) | -4.36 (-4.43) | 773.86 (773.41) | 4.36 (4.43) |
| **E** | -61.35 (-61.23) | -59.09 (-59.07) | 4.36 (4.43) | 59.09 (59.07) |

The covariance matrix _{1 }= 0.2 nM s^{-1}, _{2 }= 4 nM^{-1} s^{-1}, _{3 }= 3 s^{-1}, _{4 }= 1 s^{-1}, _{5 }= 0.15 s^{-1}. The system volume was 10^{-12} l. The covariances calculated using the LNA were compared with those obtained from simulation (values in brackets) using 10^{4} time series generated using the Gillespie algorithm (Direct Method) in COPASI
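For readers who want to see what such a simulation involves, the following is a minimal from-scratch sketch of the Direct Method for the Michaelis-Menten mechanism. It is not the COPASI implementation, and a single short run gives only a rough estimate compared with the 10^{4} time series used for the table. The names k-style constants c1 to c5, the initial state and the total of 220 enzyme molecules are assumptions made for this illustration; the rate constants and volume are those of the caption.

```python
import random

AVOGADRO = 6.02214076e23
VOLUME_L = 1e-12
OMEGA = AVOGADRO * VOLUME_L * 1e-9  # particles per nM in the given volume

# Caption values converted to particle units (symbol names are assumptions).
c1 = 0.2 * OMEGA             # -> S         (zeroth order)
c2 = 4.0 / OMEGA             # S + E -> SE  (second order)
c3, c4, c5 = 3.0, 1.0, 0.15  # first-order reactions

E_TOTAL = 220  # assumed total enzyme count; not recoverable from this section

# State order: S, E, SE, P. One tuple of particle-number changes per reaction.
CHANGES = [(1, 0, 0, 0), (-1, -1, 1, 0), (1, 1, -1, 0), (0, 1, -1, 1), (0, 0, 0, -1)]


def propensities(state):
    s, e, se, p = state
    return [c1, c2 * s * e, c3 * se, c4 * se, c5 * p]


def gillespie_direct(state, t_end, rng):
    """Direct Method; returns the final state and the time-averaged mean and variance of S."""
    state = list(state)
    t = 0.0
    sum_s = sum_s2 = 0.0
    while t < t_end:
        a = propensities(state)
        a_total = sum(a)  # always positive because c1 > 0
        dt = min(rng.expovariate(a_total), t_end - t)
        sum_s += state[0] * dt        # time-weighted statistics of S
        sum_s2 += state[0] ** 2 * dt
        t += dt
        if t >= t_end:
            break
        threshold = rng.random() * a_total  # pick a reaction proportionally to its propensity
        cumulative = 0.0
        for change, a_mu in zip(CHANGES, a):
            cumulative += a_mu
            if threshold < cumulative:
                state = [n + d for n, d in zip(state, change)]
                break
    mean_s = sum_s / t_end
    return state, mean_s, sum_s2 / t_end - mean_s ** 2


# Start close to the deterministic steady state (assumed initial condition).
initial = (600, 100, E_TOTAL - 100, 800)
final_state, mean_s, var_s = gillespie_direct(initial, t_end=100.0, rng=random.Random(0))
print("final state (S, E, SE, P):", final_state)
print(f"time-averaged mean of S: {mean_s:.1f}, variance of S: {var_s:.1f}")
```

The enzyme conservation relation (E + SE constant) holds exactly along the trajectory, which provides a simple sanity check on the update rules.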

As just mentioned, we implemented the LNA described above in the software COPASI

Briefly, our LNA implementation in COPASI first detects dependent species (conservation relations) and carries out the corresponding reduction of the system, if needed. Then an automatic search for a steady state of the system is started. If a steady state has been found the Lyapunov equation Eq. (10) for the reduced system is solved at this steady state using the Bartels-Stewart algorithm. Finally, the covariance matrix for the full system is recovered as described above.
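The four steps just described can be sketched for the Michaelis-Menten mechanism as follows. This is an illustration under stated assumptions, not the COPASI code: the steady state is written down analytically rather than searched for, the Lyapunov equation A·C + C·A^T + B = 0 (with A the Jacobian and B the diffusion matrix of the reduced system, in the notation assumed here) is solved as a plain linear system instead of with the Bartels-Stewart algorithm, and the total of 220 enzyme molecules is an assumed value, since it is not stated in recoverable form in this section. The names k1 to k5 for the rate constants are likewise assumptions.

```python
AVOGADRO = 6.02214076e23
OMEGA = AVOGADRO * 1e-12 * 1e-9  # particles per nM in a volume of 1e-12 l

k1, k2, k3, k4, k5 = 0.2, 4.0, 3.0, 1.0, 0.15  # values from the covariance table caption
E_TOTAL = 220 / OMEGA                          # nM; assumed total enzyme concentration

# Step 1: reduced system (S, SE, P); E = E_TOTAL - SE is eliminated.
STOICH = [
    [1, -1, 1, 0, 0],   # S
    [0, 1, -1, -1, 0],  # SE
    [0, 0, 0, 1, -1],   # P
]

# Step 2: steady state (analytic for this small model).
se = k1 / k4
p = k1 / k5
e = E_TOTAL - se
s = (k3 + k4) * se / (k2 * e)

rates = [k1, k2 * s * e, k3 * se, k4 * se, k5 * p]
# Derivatives of each rate with respect to (S, SE, P), using E = E_TOTAL - SE.
drates = [[0, 0, 0], [k2 * e, -k2 * s, 0], [0, k3, 0], [0, k4, 0], [0, 0, k5]]

n, m = 3, 5
A = [[sum(STOICH[i][r] * drates[r][j] for r in range(m)) for j in range(n)] for i in range(n)]
B = [[sum(STOICH[i][r] * rates[r] * STOICH[j][r] for r in range(m)) for j in range(n)]
     for i in range(n)]


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in reversed(range(size)):
        x[r] = (aug[r][size] - sum(aug[r][c] * x[c] for c in range(r + 1, size))) / aug[r][r]
    return x


# Step 3: Lyapunov equation A*C + C*A^T + B = 0 as a linear system in the entries of C.
M = [[0.0] * (n * n) for _ in range(n * n)]
for i in range(n):
    for j in range(n):
        for k in range(n):
            M[n * i + j][n * k + j] += A[i][k]
            M[n * i + j][n * i + k] += A[j][k]
c_flat = solve_linear(M, [-B[i][j] for i in range(n) for j in range(n)])
C = [[OMEGA * c_flat[n * i + j] for j in range(n)] for i in range(n)]  # particle units

# Step 4: recover the eliminated species. E = E_TOTAL - SE, so cov(E, X) = -cov(SE, X).
full = [row[:] + [-row[1]] for row in C]
full.append([-C[1][j] for j in range(n)] + [C[1][1]])

for name, row in zip(["S", "SE", "P", "E"], full):
    print(f"{name:>2}", " ".join(f"{v:10.2f}" for v in row))
```

With the assumed enzyme total, the printed matrix agrees with the LNA values listed in the covariance table for the Michaelis-Menten mechanism (for example, a variance of about 1455.8 particles^{2} for S).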

In addition, before the LNA is carried out COPASI automatically checks the model according to a number of criteria that preclude a direct calculation of the LNA. For instance, if there are reversible reactions present in the model COPASI will notify the user that all reversible reactions have to be split into irreversible reactions before the LNA can be applied. There exists a tool in COPASI which can do this in an automated way for a large class of models.
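The structural part of such a splitting step is simple, as the following sketch shows. It is not the COPASI tool: the reaction representation is hypothetical, and the sketch does not derive the separate rate laws that the forward and backward reactions each need.

```python
def split_reversible(reactions):
    """Replace every reversible reaction by a forward and a backward irreversible one."""
    result = []
    for substrates, products, reversible in reactions:
        result.append((substrates, products))
        if reversible:
            result.append((products, substrates))
    return result


# Hypothetical reaction list: (substrates, products, reversible flag).
reactions = [
    (("S", "E"), ("SE",), True),
    (("SE",), ("P", "E"), False),
]
irreversible = split_reversible(reactions)
for substrates, products in irreversible:
    print(" + ".join(substrates), "->", " + ".join(products))
```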

Optimisation is a general modelling tool with a wide application to the solution of diverse problems. Essentially, if something can be specified as a maximum or minimum of some function, optimisation will be the way to solve such a problem. In biochemical network modelling the most common application is parameter estimation; another one is the design of genetically engineered pathways (commonly known as metabolic engineering) where one seeks to maximise a flux, titre or a yield of a biotransformation

There are many different numerical algorithms for finding minima (or maxima) of functions: traditional gradient-based methods, direct search methods that use geometric heuristics, and population-based algorithms such as evolutionary algorithms and particle swarm

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

JP developed the original idea, designed the study, implemented the method in the software COPASI, carried out the calculations on the p38 MAPK model and wrote the manuscript. JDC contributed to the implementation of the method, carried out the calculations on the Michaelis-Menten and ERK MAPK models and wrote the manuscript. PM contributed to the implementation of the method and the design of the study and wrote the manuscript. AJM developed the original idea, contributed to the design of the study and wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements

JP and PM thank the UK's BBSRC (grant BB/F018398/1) and BBSRC/EPSRC (grant BB/C008219/1), the USA's NIH (grant GM080219), the German Federal Ministry of Education and Research (BMBF), the Klaus Tschira Foundation, and the EU FP7 project UniCellSys, grant no. 201142. JDC thanks EPSRC for the award of a PhD studentship. This is a contribution from the Manchester Centre for Integrative Systems Biology.