Department of Mathematics, Bioinformatics Program, Georgia Institute of Technology, Atlanta, GA 30332, USA

The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Integrative BioSystems Institute and The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Abstract

Background

Gillespie's stochastic simulation algorithm (SSA) for chemical reactions admits three kinds of elementary processes, namely, mass action reactions of 0^{th}, 1^{st} or 2^{nd} order. All other types of reaction processes, for instance those containing non-integer kinetic orders or following other types of kinetic laws, are assumed to be convertible to one of the three elementary kinds, so that SSA can validly be applied. However, the conversion to elementary reactions is often difficult, if not impossible. Within deterministic contexts, a strategy of model reduction is often used. Such a reduction simplifies the actual system of reactions by merging or approximating intermediate steps and omitting reactants such as transient complexes. It would be valuable to adopt a similar reduction strategy for stochastic modelling. Indeed, efforts have been devoted to manipulating the chemical master equation (CME) in order to achieve a proper propensity function for a reduced stochastic system. However, manipulations of the CME are almost always complicated, and successes have been limited to relatively simple cases.

Results

We propose a rather general strategy for converting a deterministic process model into a corresponding stochastic model and characterize the mathematical connections between the two. The deterministic framework is assumed to be a generalized mass action system and the stochastic analogue is in the format of the chemical master equation. The analysis identifies situations: where a direct conversion is valid; where internal noise affecting the system needs to be taken into account; and where the propensity function must be mathematically adjusted. The conversion from deterministic to stochastic models is illustrated with several representative examples, including reversible reactions with feedback controls, Michaelis-Menten enzyme kinetics, a genetic regulatory motif, and stochastic focusing.

Conclusions

The construction of a stochastic model for a biochemical network requires the utilization of information associated with an equation-based model. The conversion strategy proposed here guides a model design process that ensures a valid transition between deterministic and stochastic models.

Background

Most stochastic models of biochemical reactions are based on the fundamental assumption that no more than one reaction can occur at the exact same time. A consequence of this assumption is that only elementary chemical reactions can be converted directly into stochastic analogues

Similar model reduction efforts have been carried out for stochastic modelling. For instance, the use of a complex-order function (which corresponds to a reduced equation-based model) was shown to be justified for some types of stochastic simulations. A prominent example is again the Michaelis-Menten rate law, which can be reduced from a system of elementary reactions to an explicit function by means of the

In general, the construction of a stochastic model for a large biochemical network requires the use of information available from an equation-based model. In the past, several strategies have been proposed for this purpose and within the context of Gillespie's exact stochastic simulation algorithm (SSA; ^{th}, 1^{st }and 2^{nd }order reactions follow mass action rate laws. More recently the moment method was extended to cover models consisting of rational rate laws

In this article, we explore the mathematical connection between deterministic and stochastic frameworks for the pertinent case of Generalized Mass Action (GMA) systems, which are frequently used in Biochemical Systems Theory (BST;

Representations of systems of biochemical reactions

Consider a well-stirred biochemical reaction system with constant volume and temperature, where _{s }_{r }

where _{s }_{r}_{r }_{r}_{s }_{r}_{r }

The stoichiometric vectors of all reactions can further be arranged as the stoichiometric matrix of the system

The size of the system is defined as Φ =

The modelling of biochemical reaction networks typically uses one of two conceptual frameworks: deterministic or stochastic. In a deterministic framework, the state of the system is given by a non-negative vector _{s}_{s}_{s}_{s }_{s }_{s}

Motivation for the power-law formalism: reactions in crowded media

Power-law functions with non-integer kinetics have proven very useful in biochemical systems analysis, and forty years of research have demonstrated their wide applicability (

Within the conceptual framework of power-law representations, the rate of the association reaction between molecules of species _{1 }and _{2 }is given as _{1 }and _{2 }are real-valued _{3 }molecules as

The first term on the right-hand side of this equation, _{1}], [_{2}])Δ_{1}_{2}, describes the production of _{3}: it depends on the totality of possible collisions _{1 }_{2 }and also on some fraction _{1}], [_{2}])Δ_{1}], [_{2}]) equals a traditional rate constant, and the reaction obeys the law of mass action, while in a spatially restricted environment, such as the cytoplasm, one needs to take crowding effects into account. As shown in Savageau _{1 }and _{2}. The second term, _{3}]) Δ_{3}, describes the fraction _{3}]) Δ_{3 }that dissociates back into _{1 }and _{2}. This fraction may depend on some functional form of [_{3}] because in a crowded environment the complex may not be able to dissociate effectively. Thus, rate constants in the generalized mass action setting become rate functions (

By taking the limit Δt → 0, one obtains the differential equation

Savageau used Taylor series expansion to approximate the functions _{1}], log[_{2}]) around some operating point (

where _{f}

The same procedure leads to the power-law expression for the degradation term: _{3}]) ≈ _{d }_{3}]^{γ}_{3 }as

where _{f}_{d}
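The Taylor expansion in logarithmic space described above can be illustrated numerically. The following is a minimal sketch rather than the authors' procedure: the function name `power_law_approximation`, the example rate `saturating_rate`, and all parameter values are assumptions chosen for illustration. The kinetic order is estimated as the slope of log(rate) against log(concentration) at the operating point, and the rate constant is then chosen so that the power law matches the original rate at that point.

```python
import math


def power_law_approximation(rate, x0, h=1e-6):
    """Return (gamma, f) such that rate(x) is approximately gamma * x**f near x0."""
    log_x0 = math.log(x0)
    # Kinetic order: slope of log(rate) versus log(x) at the operating point,
    # estimated with a central difference in log space.
    log_rate_up = math.log(rate(math.exp(log_x0 + h)))
    log_rate_down = math.log(rate(math.exp(log_x0 - h)))
    f = (log_rate_up - log_rate_down) / (2.0 * h)
    # Rate constant: forces the power law to equal the rate at x0.
    gamma = rate(x0) / x0 ** f
    return gamma, f


def saturating_rate(x, vmax=10.0, km=2.0):
    """Hypothetical saturating rate law, used only as a test function."""
    return vmax * x / (km + x)


# Operating point chosen at x0 = km, where the kinetic order is km / (km + x0).
gamma, f = power_law_approximation(saturating_rate, x0=2.0)
```

Because the approximation is local, its quality degrades as the concentration moves away from the operating point, which is why the choice of operating point matters in practice.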

The Generalized Mass Action (GMA) format

In the GMA format within Biochemical Systems Theory, each process is represented as a univariate or multivariate power-law function. GMA models may be developed _{0}) at some initial time _{0}. Generically, the state of the system is changed within a sufficiently small time interval by one out of the _{r }_{r }

for those _{s}_{rs }_{s }_{rs}_{rs }_{rs}_{s }

for every _{s}_{rs }_{s}_{rs }_{rs }_{s }_{r}_{s }_{rs }_{s }_{r }_{r}

Proper use of equation-based functions for stochastic simulations

The fundamental concept of a stochastic simulation is the propensity function **X**), and **X**)**X**) = _{s}**X**), if the deterministic model is _{s}_{s}**X**, _{s}

1)

2) the reaction is monomolecular;

3) all _{i }

Each of these assumptions constitutes a sufficient condition for the direct use of a rate function as the propensity function and applies, in principle, to GMA as well as other systems. The validity of these conditions will be discussed later. Specifically, the first condition will be addressed in the Results section under the headings "0^{th}-order reaction kinetics" and "1^{st}-order reaction kinetics," while the second condition will be discussed under the heading "Real-valued order monomolecular reaction kinetics." The third condition will be the focus of Equations (29-36) and their associated explanations.

In reality, the rates of reactions in biochemical systems are commonly nonlinear functions of the reactant species, and fluctuations within each species are not necessarily ignorable. Therefore, the valid use of an equation-based model in a stochastic simulation mandates that we know how to define a proper propensity function. The following section addresses this issue. It uses statistical techniques to characterize estimates for both the mean and variance of the propensity function, and these features will allow an assessment of the validity of the assumption **X**) = _{s}**X**) and prescribe adjustments if the assumption is not valid.

Methods

Deriving the mean and variance of a power-law function of random variables

Consider a generic power-law function of random variables _{s }_{PL }_{PL }

(for details, see Additional file

**Derivation of the mean and variance of a power-law function of random variables**.

and _{s }_{s}_{s}_{i}_{j}_{s}_{s}^{2 }and covariance _{ij }_{i}_{j}

where

Since many biochemical variables approximately follow a log-normal distribution _{1}, ..., _{s}_{1}, ..., log_{s}_{i}_{j}

where

The approximation formulae for _{PL }_{PL}^{2 }in eqns. (8)-(10) provide an easy numerical implementation if observation data are available to estimate cov [log_{i}_{j}_{PL }_{PL}^{2 }are related to _{s}_{s}^{2 }and _{ij}_{PL }_{PL}^{2 }on (_{s}_{s}^{2}, _{ij}
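Under the log-normal assumption mentioned above, the mean of a power-law function has a simple closed form, because the logarithm of a product of powers is a linear combination of jointly normal variables. The sketch below checks this standard identity, E[prod X_i^{f_i}] = exp(f·mu + f^T Sigma f / 2), against a Monte Carlo estimate. Here `mu_log` and `cov_log` denote the mean vector and covariance matrix of the log-concentrations. All names and numerical values are hypothetical, and the identity is shown as a general property of log-normal variables, not as a reproduction of the exact form of eqns. (8)-(10).

```python
import numpy as np


def lognormal_power_law_mean(f, mu_log, cov_log):
    """Mean of prod_i X_i**f_i when log X is multivariate normal."""
    f = np.asarray(f, dtype=float)
    mu_log = np.asarray(mu_log, dtype=float)
    cov_log = np.asarray(cov_log, dtype=float)
    # log of the product is f . log X, which is normal with
    # mean f . mu and variance f^T Sigma f.
    return float(np.exp(f @ mu_log + 0.5 * f @ cov_log @ f))


# Hypothetical two-species example with correlated log-concentrations.
rng = np.random.default_rng(0)
mu_log = np.array([1.0, 0.5])
cov_log = np.array([[0.04, 0.01],
                    [0.01, 0.09]])
f = np.array([0.8, 1.5])  # real-valued kinetic orders

samples = np.exp(rng.multivariate_normal(mu_log, cov_log, size=200_000))
monte_carlo = float(np.mean(np.prod(samples ** f, axis=1)))
analytic = lognormal_power_law_mean(f, mu_log, cov_log)
```

The off-diagonal entries of `cov_log` enter the exponent through the quadratic form, which is how correlations between species shift the mean of a multivariate power-law rate.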

Deriving proper propensity functions for stochastic simulations from differential equation-based models

Assuming that the GMA model faithfully captures the average behaviour of a biochemical reaction system and recalling

where Φ is the system size as defined above.

To describe the reaction channel _{r }**v**_{r }and must characterize the quantity of molecules flowing through reaction channel _{r }_{r}(**x**), which is defined as

**X**(t) is no longer deterministic, and the result is instead stochastic and based on the transition probability

which follows the chemical master equation (CME)
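For reference, the CME is commonly written in the following standard form. The symbols are notational assumptions made here for readability and may differ from those of the original equation: P denotes the transition probability, a_r the propensity function of reaction channel r, and **v**_r its stoichiometric vector.

```latex
\frac{\partial P(\mathbf{x}, t \mid \mathbf{x}_0, t_0)}{\partial t}
  = \sum_{r} \Big[ a_r(\mathbf{x} - \mathbf{v}_r)\,
      P(\mathbf{x} - \mathbf{v}_r, t \mid \mathbf{x}_0, t_0)
    - a_r(\mathbf{x})\, P(\mathbf{x}, t \mid \mathbf{x}_0, t_0) \Big]
```

The first term collects probability flowing into state **x** from the states that are one reaction event away, and the second term collects probability flowing out of **x**.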

Updating CME requires knowledge of every possible combination of all species counts within the population, which immediately implies that it can be solved analytically for only a few very simple systems and that numerical solutions are usually prohibitively expensive ^{th }order reactions, exemplified with the generation of a molecule at a constant rate; 2) 1^{st }order monomolecular reactions, such as an elemental chemical conversion or decay of a single molecule; 3) 2^{nd }order bimolecular reactions, including reactive collisions between two molecules of the same or different species. The reactive collision of more than two molecules at exactly the same time is considered highly unlikely and modelled as two or more sequential bimolecular reactions.
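The three elementary reaction types listed above can be simulated with Gillespie's direct method. The sketch below is a minimal illustration under stated assumptions, not the implementation used in this article: the function names, the channel encoding as `(c, reactants, change)` tuples, and the example rate parameters are all hypothetical. Propensities follow the usual mass-action counting of reactant combinations.

```python
import random


def elementary_propensity(c, reactants, x):
    """Mass-action propensity of a 0th-, 1st- or 2nd-order channel.

    reactants is a tuple of species indices consumed by the channel.
    """
    if len(reactants) == 0:          # 0th order: constant production
        return c
    if len(reactants) == 1:          # 1st order: conversion or decay
        return c * x[reactants[0]]
    i, j = reactants                 # 2nd order: reactive collision
    if i == j:                       # two molecules of the same species
        return c * x[i] * (x[i] - 1) / 2.0
    return c * x[i] * x[j]


def ssa_direct(x0, channels, t_end, seed=0):
    """Simulate one trajectory; channels is a list of (c, reactants, change)."""
    rng = random.Random(seed)
    t, x = 0.0, list(x0)
    path = [(t, tuple(x))]
    while True:
        a = [elementary_propensity(c, reactants, x) for c, reactants, _ in channels]
        a_total = sum(a)
        if a_total <= 0.0:
            break                    # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t > t_end:
            break
        # Choose the firing channel with probability a_r / a_total.
        threshold = rng.random() * a_total
        cumulative = 0.0
        for a_r, (_, _, change) in zip(a, channels):
            cumulative += a_r
            if threshold < cumulative:
                break
        x = [n + d for n, d in zip(x, change)]
        path.append((t, tuple(x)))
    return path


# Hypothetical network with species (A, B): 0 -> A, A -> B, A + B -> 0.
channels = [
    (10.0, (), (1, 0)),
    (0.1, (0,), (-1, 1)),
    (0.01, (0, 1), (-1, -1)),
]
path = ssa_direct((0, 0), channels, t_end=5.0, seed=1)
```

Only one reaction fires per step, which reflects the assumption that no two reactions occur at exactly the same time.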

For elementary reactions, the propensity function of reaction _{r }_{r }_{r }

Here _{s }_{s}_{s }_{s }_{s }

- v

In Gillespie's original formulation _{r }_{r}dt _{r }

Since the assumption of mass action kinetics is not valid generally, especially in spatially restricted environments and in situations dominated by macromolecular crowding, we address the broader scenario where c_{r }is not a constant but a function of the reactant concentrations. Thus, we denote c_{r }as a _{r }

Here, _{r }_{rs }_{r}_{rs }

In order to identify the functional expression for a stochastic rate function, and thus the propensity function, we consider the connection between the stochastic and the deterministic equation models. By multiplying CME with **x **and summing over all **x**, we obtain

Similarly, the expectation for any species _{s}

The details of these derivations are shown in Additional file
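The result of this operation has a well-known general form, sketched here in assumed notation: a_r is the propensity function of channel r, v_{rs} is the stoichiometric coefficient of species s in channel r, and angle brackets denote expectations.

```latex
\frac{d\,\langle X_s \rangle}{dt}
  = \sum_{r} v_{rs}\, \big\langle a_r(\mathbf{X}) \big\rangle
```

For a nonlinear propensity, the expectation of a_r(**X**) generally differs from a_r evaluated at the mean of **X**. This difference is the reason why the mean of a stochastic model need not follow the deterministic rate equation.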

We can use these results directly to compute the propensity function for a stochastic GMA model, assuming that its deterministic counterpart is well defined. Specifically, we start with the deterministic GMA equation for _{s}

where _{rs}_{r }_{rs' }are again the stoichiometric coefficients, rate constants, and kinetic orders, respectively. By substituting

Elementary operations allow us to rewrite this equation as

where

Now we have two choices for approximating the expectation of the propensity function on the left-hand side of equation (29):

1) adopt a zero-covariance assumption as was done in

for every _{r}

and

Here, the index

With the zero-covariance assumption, one can substitute (32) back into the equation for the expectation for each species, which yields

for every _{s}

Equation (33) is based on the assumption that both the fluctuations within species and their correlations are ignorable, which is not necessarily true in reality. If one uses it in simulations where the assumptions are not satisfied, it is possible that the means for the molecular species are significantly different from the corresponding equation-based model values. This discrepancy arises because the evolution of each species in the stochastic simulation is in truth affected by the covariance, which is not necessarily zero, as was assumed. This phenomenon was observed by Paulsson and collaborators _{r_0 }and obtain mean and variance as

where

for every _{s}^{th }order; 2) the reaction is a real value-order monomolecular reaction, with 1^{st }order reaction as a special case; 3) the covariance contribution in (34) is sufficiently small to be ignored for all participating reactant species of a particular reaction channel. Except for these three special situations, the covariance as shown in (34) significantly affects the mean dynamics. Therefore, stochastic simulations using zero-covariance propensity functions will in general yield means different from what the deterministic GMA model produces. How large these differences are cannot be said in generality. Under the assumption that the GMA model correctly captures the mean dynamics of every species, this conclusion means that _{r_0 }is not necessarily an accurate propensity function for stochastic simulations, and the direct conversion of the equation-based model into a propensity function must be considered with caution.

Moreover, there is no theoretical basis to assume that there are no fluctuations in the molecular species or that these are independent. Therefore, we need to consider the second treatment of the expectation of the propensity function and study the possible effects of a non-zero covariance.

2) We again assume that the GMA model is well defined, which implies that information regarding the species correlations and fluctuations has been captured in the parameters of the GMA model on the left-hand side of Equations (7) and (28). To gain information regarding correlations, we use a Taylor expansion to approximate the propensity function (see Additional file

After substitution of (37) in (29), one obtains

Given the state **x **of the system at time _{r }

Here it is important to understand that although the random variables {_{s}_{s∈S }appear in the expression _{r}**x**), _{r}**x**) is not a function of random variables but a deterministic function. The reason is that the cov [log_{i}_{j}_{r}**x**), which as the numerical characteristic of the random variables {_{s}_{s∈S}, is deterministic. Therefore, the stochastic rate function _{r}**x**) is a well-justified deterministic function that is affected by both the state of the system _{i}_{j}_{s}_{s∈S}.

Given the expression _{r}**x**), the propensity function is

These results are based on the assumption that there are large numbers of molecules for all reactant species participating in reaction _{r}_{r }

_{r}_{r_cov}, in order to distinguish it from the propensity function _{r_0 }(32), which is based on the assumption of zero-covariance,

Remembering that cov [log_{i}_{j}_{r}**x**) and now in the function _{r }_{r_cov }in (41), which corrects the stochastic simulation toward the correct average.

In contrast to the propensity function _{r_0}, _{r_cov }leads to accurate stochastic simulations. To illustrate this difference, we analyze _{r_cov}:

Here

By substituting (42) back into the derivation of CME (26), one obtains

for every _{s}_{r_cov }in the CME derived equation (27) is approximately identical to the corresponding macroscopic variable in the GMA model.

**Calculation of **cov [log_{i}_{j}**]**

When data in the form of multiple time series for all the reactants are available, it is possible to compute cov [log_{i}_{j}**] **directly from these data. Once this covariance is known, the function _{r_cov }and the mean dynamics can all be assessed. Alas, the availability of several time series data for all reactants under comparable conditions is rare, so that cov [log_{i}_{j}**] **must be estimated in a different manner.

If one can validly assume that the covariance based on _{r_0 }does not differ significantly from the covariance based on _{r_cov}, one may calculate cov [log_{i}_{j}**] **by one of the following methods.

Method 1:

One uses _{r_0 }to generate multiple sets of time series data of all reactants and then computes cov [log_{i}_{j}**]**.

Method 2:

First, cov [log_{i}_{j}**] **is expressed as a function of mean and covariance in one of the following ways; either as

or as Equation (14):

The first functional expression of cov [log_{i}_{j}**] **is achieved by Taylor approximation, whereas the second expression is obtained by the additional assumption that the concentrations (_{1}, ..., _{s}

Second, one uses _{r_0 }to approximate the mean and covariance either by direct simulation, as shown in method 1, or by a moment-based approach, which is explained in Additional file
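The direct estimate in Method 1 amounts to computing, at each time point, the sample covariance of the log-transformed molecule counts across independent simulation runs. The following is a minimal sketch under stated assumptions: the array layout `(n_runs, n_times, n_species)` and the function name are hypothetical, and all counts are assumed to be strictly positive so that the logarithm is defined. The random integers merely stand in for simulation output.

```python
import numpy as np


def log_covariance_over_runs(counts):
    """Sample covariance of log-counts across runs, for every time point.

    counts has shape (n_runs, n_times, n_species) and must be positive.
    Returns an array of shape (n_times, n_species, n_species).
    """
    logs = np.log(np.asarray(counts, dtype=float))
    centered = logs - logs.mean(axis=0, keepdims=True)
    n_runs = logs.shape[0]
    # Sum over runs of outer products, with the unbiased (n_runs - 1) divisor.
    return np.einsum("rti,rtj->tij", centered, centered) / (n_runs - 1)


# Placeholder data: 200 runs, 3 time points, 2 species.
rng = np.random.default_rng(1)
counts = rng.integers(1, 50, size=(200, 3, 2))
log_cov = log_covariance_over_runs(counts)
```

If a species can reach zero molecules, the log transform fails, so a real application would need a documented convention for such states.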

**Computation of approximate mean and covariance for a generic propensity function to be used in stochastic simulations**.

For convenience of computational implementation, the above equations can be written in matrix format

Here for _{r}_{s}_{rs }= _{rs}

Statistical criteria for propensity adjustment

Suppose an equation-based model captures the average behavior of a stochastic system and one intends to find the propensity function for a stochastic simulation that will reproduce that mean. One can use the 95% confidence interval to evaluate the need for a propensity adjustment. Specifically, for stable systems that will reach a steady state, we use the reversible reaction model as an example. If the steady state of the ODE _{st }_{st }_{st }

For other systems that do not reach a steady state, but where instead transient characteristics are of the highest interest, one can judge the need of propensity adjustment by whether the pertinent characteristics of the ODEs are within the 95% confidence interval of the corresponding characteristic, which is given by a prediction from the moment-based method or from
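One possible reading of this criterion is sketched below. It is an illustration only: it assumes that the confidence interval is a normal-approximation interval for the mean of independent simulation runs, which is an assumption made here and is not confirmed by the text. The function name and the sample values are hypothetical.

```python
import math
import statistics


def needs_adjustment(ode_value, simulated_values, z=1.96):
    """Return True if the ODE value lies outside the 95% interval of the mean.

    simulated_values holds the same characteristic (for example, a steady-state
    count) measured in independent stochastic runs.
    """
    n = len(simulated_values)
    mean = statistics.fmean(simulated_values)
    # Normal-approximation half-width of the interval for the mean.
    half_width = z * statistics.stdev(simulated_values) / math.sqrt(n)
    return not (mean - half_width <= ode_value <= mean + half_width)


# Hypothetical steady-state counts from five independent runs.
runs = [98, 101, 100, 102, 99]
inside = needs_adjustment(100.5, runs)
outside = needs_adjustment(110.0, runs)
```

With more runs the interval narrows, so small systematic deviations between the ODE prediction and the simulated mean become detectable.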

Results

Generic special cases

It is generally not valid to translate a rate from a deterministic biochemical model into a propensity function of the corresponding stochastic simulation without adjustment (see Equations (34)-(36)). However, in some situations, the propensity adjustment (

1) 0^{th}-order reaction kinetics

Consider a very simple equation-based model of the type

for all _{s}_{rs }

Thus, for a 0^{th}-order reaction, its rate equation can be taken directly as the propensity function in stochastic simulations.

2) 1^{st}-order reaction kinetics

Direct application of Equations (40)-(44) yields

_{rs }_{sj}_{s}

Thus, for 1^{st}-order reactions, the rate equation can again be taken directly as the propensity function in stochastic simulations.

3) Real-valued order monomolecular reaction kinetics

Consider a reaction with kinetics of the type

_{rj }_{rs }_{s}

Thus, for reaction kinetics involving a single variable and a real-valued order, the rate equation can again be taken as the propensity function in stochastic simulations.

4) 2^{nd}-order reaction kinetics

This type of reaction can be expressed as

_{s}_{ri }_{rj }_{rs }

Thus, the proper propensity function for 2^{nd}-order reactions is different from the rate equation. The difference can be ignored only if the contribution from the covariance is insignificant. In general, the rate equation yields only an approximate propensity function for stochastic simulations, and the approximation quality must be assessed on a case-by-case basis.

5) Bimolecular reaction with real-valued order kinetics

This type of reaction can be formulated as

_{s}_{ri}_{rj }_{rs }

For bimolecular reactions of complex order, the propensity function is different from the rate equation. The difference can be ignored only if the contribution from the covariance is insignificant.
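The role of the covariance in the bimolecular cases can be seen from an elementary identity, shown here for the 2^{nd}-order case in assumed notation (k is a rate constant and X_i, X_j are the molecule counts of two different reactant species):

```latex
\big\langle k\, X_i X_j \big\rangle
  = k\, \langle X_i \rangle \langle X_j \rangle
  + k\, \operatorname{cov}\!\left[ X_i, X_j \right]
```

The first term on the right-hand side is the rate that the deterministic model assigns to the reaction. The second term is the covariance contribution, which vanishes only if the two species are uncorrelated.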

Power-law representation of a reversible reaction with feedback controls

We consider a reversible reaction with feedback controls (see Figure

**Scheme of reversible reaction with feedback controls**. S_{3 }inhibits the forward reaction and S_{1 }activates the reverse reaction.

Here _{3 }feeds back to inhibit the forward reaction and _{1 }feeds back on the reverse reaction and accelerates it. The task is to develop a stochastic model whose performance converges to that of the deterministic GMA model. We can see from equations (52) that three variables _{1}, _{2 }and _{3 }contribute to the forward flux _{1 }and _{3 }contribute to the backward flux

To simplify the calculation, as explained in detail in Additional file

Here _{1}, _{2}, _{3})^{T}

Moreover, for _{1}", _{2}")^{T}_{1}"⊙ _{2}"⊙ ^{T}_{1}', _{2}'),

Two initial conditions are chosen for representative simulations; they differ by a factor of 20 in species populations and reaction volume between the upper and lower panels of Figure

**Comparative simulation results for a reversible reaction with feedback controls**. In all panels, the _{1}. The upper and lower panels use two different sets of initial numbers of molecules, namely: (_{1}(0), _{2}(0), _{3}(0), ^{3}) and (_{1}(0), _{2}(0), _{3}(0), ^{3}), respectively. Other simulation parameters are (_{1}, _{2}, _{3}, _{1}, _{3}, _{f}_{g}_{1 }molecules by different methods: the black line shows the ODE solution of Equation (52) for _{1 }; the blue lines are the solutions of Equation (53) for _{1 }and for _{1 }± _{1}, respectively. The red dotted lines framing the mean indicate the 95% confidence interval. The second column shows the propensity adjustment functions for the forward reaction (solid line) and the backward reaction (dashed line). The third column shows 100 independent stochastic simulations with propensity adjustment (blue means and error bars), in comparison with the ODE (Equation (52)) prediction (black line). The fourth column shows a second set of 100 independent stochastic simulations without propensity adjustment (blue means and error bars), in comparison with the ODE (Equation (52)) prediction (black line). The red dotted lines framing the mean in columns 3 and 4 again indicate the 95% confidence intervals.

Repressilator

Interestingly, a propensity function may even be obtained through power-law approximation of some function that describes complex transient behaviours of a reaction network. As an example, consider the so-called _{1 }codes for protein _{1}, whose dimer _{1 }subsequently represses the transcription of the gene _{2}. Similarly, _{2}, the dimer of gene _{2}'s protein product _{2}, represses the transcription of gene _{3}, and _{3}, the dimer of gene _{3}'s protein product _{3}, represses the transcription of gene _{1}. The corresponding differential equation model following mass action kinetics is given by

**Reaction scheme of the Repressilator**. Gene _{1 }codes for protein _{1}, whose dimer _{1 }represses the transcription of gene _{2}. Similarly, _{2}, the dimer of gene _{2}'s protein product _{2}, represses the transcription of gene _{3}, and _{3}, the dimer of gene _{3}'s protein product _{3}, represses the transcription of gene _{1}.

where

Assuming that the reversible dimerization and the dissociation/association of a protein dimer from/to the promoter are much faster than other processes, the full system can be reduced to

_{p }_{+}/_{-}, _{d }_{+}/_{- }and _{0, i }+ _{r, i }

In

Intriguingly, one makes the following observation. The scaled ODE system (56) is consistent with the original system (55) in oscillation amplitude and period. However, its corresponding stochastic model produces results that deviate substantially from the average responses. To see the effects of the transition from a deterministic to a stochastic model, we apply SSA to the scaled system (56). The main result is that the oscillation periods of both _{i }_{i }_{i }_{i}

**Scaling of the Repressilator equations changes the oscillation period in the stochastic simulation**. Solid lines represent solutions of ODEs (56), while dotted lines are trajectories of a stochastic simulation; blue lines represent _{1 }and black lines represent _{1}.

We can see from equations (55) that two variables _{i }_{i }_{i}_{i }

The influence of the covariance on the dynamics of the stochastic simulation is relatively easy to assess: we simply use the terms on the right-hand side of the differential equations (54) as the propensity functions in SSA and obtain simulation results shown in the 2^{nd }and the 4^{th }panels of Figure

Power-law approximation of _{i}^{-1}

**Power-law approximation of p(x**. Left panel: Approximation of

First, the moment-based approach requires information regarding the first and the second derivatives of _{i}^{-1}, which have rather complicated functional forms. To simplify the calculation, we replace the function _{i}^{-1 }with an approximating power-law function. Specifically, suppose the original parameter values are _{+ }= _{+ }= 5, _{- }= _{- }= 100 and _{i}_{i}^{-1})in log-log space (Figure

for _{i }

which models the original function very well (see Figure _{i }_{i}^{-1 }on the right-hand side of (55) is replaced by a power-law function (see Figure

Solving the technical issues as described, one obtains the corresponding moment-based model of (55) (not shown) with results shown in Figure

**Comparison of the dynamics of the Repressilator models using the original ODEs (55), the GMA approximation, and the moment approach based on the GMA approximation**. The mean of the moment approach based on the GMA approximation fits the original ODEs (55) very well up to about _{1 }molecules (unitless).

Enzymatic reaction using a quasi-steady state assumption (QSSA)

We consider an enzymatic reaction following the Michaelis-Menten mechanism:

Here enzyme E reacts with substrate S through a reversible reaction to form complex ES, which can proceed to yield product P and to release the enzyme E. By assuming the law of mass action for the reaction kinetics we obtain a set of differential equations for the system dynamics:

where the total amount of enzyme in the form of free enzyme and complex [_{0 }≜ [

which is known as _{max }= _{2}[_{0 }and _{m }_{-1 }+ _{2})/_{1}.

Applying QSSA, Rao and Arkin

where the volume was scaled so that Φ = 1 and the lower-case letter

First, we recast the equation-based model into the GMA format _{m }

is exactly equivalent to the reduced system in (58) with the initial condition [_{0 }and [_{0 }= _{m }_{0}. The corresponding stochastic model has only one reaction channel and the propensity function is

The propensity adjustment factor can be set to 1 because _{m }

Thus, we arrive at the propensity function for the reduced system, which is identical to the result of Rao and Arkin obtained through manipulations of CME.
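A stochastic simulation of the reduced system then needs only a single reaction channel, S → P. The sketch below assumes the familiar Michaelis-Menten form V_max·s/(K_m + s) for the propensity, with system size Φ = 1 and s the number of substrate molecules. The function name and the parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_reduced_michaelis_menten(s0, vmax, km, t_end, seed=0):
    """Direct-method simulation of the single reduced channel S -> P."""
    rng = random.Random(seed)
    t, s, p = 0.0, s0, 0
    path = [(t, s, p)]
    while s > 0:
        # Reduced propensity: the rate law evaluated at the current count.
        a = vmax * s / (km + s)
        t += rng.expovariate(a)   # waiting time to the next conversion
        if t > t_end:
            break
        s, p = s - 1, p + 1
        path.append((t, s, p))
    return path


# Hypothetical parameters; t_end is long enough for all substrate to react.
path = simulate_reduced_michaelis_menten(s0=50, vmax=10.0, km=20.0, t_end=1000.0)
```

Compared with simulating the full mechanism, the reduced model avoids tracking the enzyme and the complex explicitly, at the price of relying on the quasi-steady state assumption.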

In the above derivation, we used the simplest type of recasting, where a new, auxiliary variable simply consists of an old variable plus a constant. This reformulation of the Michaelis-Menten process as a pair of GMA equations is a special case of a much more general recasting technique that permits the equivalent conversion of any system of ordinary differential equations into a power-law format

Stochastic Focusing

Stochastic focusing

Following

This system can be interpreted as follows: the intermediate species I is produced at constant rate _{1 }from some source _{4 }through the catalysis with signalling molecule S; the end product P is converted from species I at rate _{2 }and degrades at rate _{3}; the signalling molecule S is produced and degrades at rates _{5 }and _{6}, respectively. Moreover, the value of _{5 }is reduced to half at a certain time point to achieve a significant divergence effect. In order to capture the average dynamics of the system accurately, we use a power-law model in GMA format instead of the mass action rate law in

The system size is set to 1. We can see from equations (64) that two variables _{4}(_{I }f_{S }

Here _{I}_{P}_{S}^{T}

Moreover, for _{1}", ..., _{6}")^{T}_{1 }"⊙ _{6}"⊙ ^{T}_{1}', ..., _{6}'),

The stochastic focusing model without propensity adjustment yields results quite different from those of the deterministic model, as is illustrated in Figure ^{st }panel are predicted from the moment equations (65) and the blue error bars for _{P }^{nd }panel are obtained from ten independent stochastic simulations. Both diverge systematically from the black line predicted by ODE model (64). By contrast, the stochastic model with propensity adjustment produces results consistent with the deterministic model, as shown by the 4^{th }panel.

**Stochastic focusing**. The first panel from the top compares the time evolution of product molecules _{P }_{P }_{P}_{I }f_{S }_{4 }achieves convergence between the stochastic simulation and the ODE model (64) (black line): the blue error bars were computed from 100 independent stochastic simulations with propensity adjustment _{4}. The simulation parameters are (_{1}, _{2}, _{3}, _{4}, _{5}, _{6}, _{I}_{S}^{4}, 10^{3}, 1, 9.9 × 10^{3}, 10^{4}, 10^{3}, 1.1, 0.9); at _{5 }changes from 10^{4 }to 0.5 ×10^{4}.

Discussion

It is often implicitly assumed that the rate of a dynamic process can be directly taken as the propensity for a corresponding stochastic process. We have shown here that this is sometimes, but not always, true. Our results fall into three categories. The first develops conditions for a valid conversion of a rate to a propensity, the second presents a general conversion procedure, and the third discusses computational issues of propensity adjustment.

Conditions for the direct use of a rate constant (function) as propensity function

We have shown that the direct use of a rate constant or a rate function

1) ^{th}-order and 1^{st}-order reaction kinetics.

2) the reaction is monomolecular; this assumption was evaluated in the Results section describing real-valued order monomolecular reaction kinetics.

3) all _{i }

Each of these three conditions is a sufficient condition for the direct use of a rate function _{r}(**X**(_{i}_{j}_{i}_{j}_{i}_{i}_{i}_{j}

This result implies the following: If the covariance between every pair of random variables is zero (or ignorable), we have _{i}_{i}_{i}_{i}_{r}(**X**(_{r}(**X**(

If at least one of the three assumptions is satisfied, the stochastic simulation algorithm (SSA) is applicable without changes.

A general procedure for converting an equation-based model into a stochastic analogue

In the past, efforts have been made to manipulate the chemical master equation (CME) in order to achieve a proper propensity function for a reduced system (

To address the first question, we showed that the following steps are necessary:

(1) A concentration-based model needs to be converted into a particle-based model by accounting for the size of the system; if the concentration-based model is scaled (as was illustrated with the repressilator example), it may first have to be un-scaled in order to render the conversion valid;

(2) The difference between the mean of a stochastic model without propensity adjustment and the corresponding quantities of the equation-based model should be evaluated. The mean of the stochastic model is obtained either through stochastic simulations or through a moment-based approach. If the difference is significant, then an adjustment of the propensity function for a non-elementary reaction is necessary.

To answer the second question, we need to execute the following steps:

(3) Compute a propensity adjustment function, either through simulated or experimental data or through a moment-based approach, in order to achieve the corrected propensity function (41);

(4) Apply SSA or one of its variants using a propensity function with adjustment to obtain valid simulation trajectories.
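The four steps above can be sketched for a toy system. The example below is a minimal sketch under stated assumptions, not the procedure used in the Results: it takes a hypothetical one-species generalized mass action model, dc/dt = a - b c^f, converts it to particle numbers with a system size `omega` (step 1), and compares the long-run time average of an unadjusted SSA run with the deterministic steady state (step 2). The `adjust` argument is only a placeholder hook for steps (3) and (4); it is a constant multiplier chosen for illustration and does not reproduce the adjustment function of equation (41). All names and parameter values are assumptions.

```python
import random

def propensities(x, a, b, f, omega, adjust=1.0):
    """Particle-based propensities for 0 -> X (rate a) and X -> 0 (rate b*c^f).

    Step (1): the concentration c is replaced by x / omega and each rate is
    multiplied by the system size omega. `adjust` is a hypothetical
    multiplier on the non-elementary channel; 1.0 means no adjustment.
    """
    production = a * omega
    degradation = adjust * b * omega * (x / omega) ** f if x > 0 else 0.0
    return production, degradation

def ssa_time_average(a, b, f, omega, t_end, adjust=1.0, seed=1):
    """Gillespie direct method; returns the time-averaged particle number."""
    rng = random.Random(seed)
    x = round(omega * (a / b) ** (1.0 / f))  # start at the deterministic steady state
    t, area = 0.0, 0.0
    while True:
        p_prod, p_deg = propensities(x, a, b, f, omega, adjust)
        total = p_prod + p_deg               # always > 0 because production > 0
        dt = rng.expovariate(total)          # waiting time to the next reaction
        if t + dt >= t_end:
            return (area + x * (t_end - t)) / t_end
        area += x * dt
        t += dt
        # choose the reaction channel in proportion to its propensity
        x += 1 if rng.random() * total < p_prod else -1

def relative_mean_gap(a, b, f, omega, t_end, adjust=1.0, seed=1):
    """Step (2): relative difference between the SSA mean and the ODE steady state.

    Requires f != 0 so that the steady state (a/b)^(1/f) is defined.
    """
    x_det = omega * (a / b) ** (1.0 / f)
    return ssa_time_average(a, b, f, omega, t_end, adjust, seed) / x_det - 1.0

if __name__ == "__main__":
    for f in (1.0, 0.5):
        gap = relative_mean_gap(a=1.0, b=1.0, f=f, omega=5, t_end=20000.0)
        print(f"kinetic order {f}: relative gap of the mean = {gap:+.3f}")
```

In this toy setting the gap should be close to zero for f = 1 and clearly positive for f = 0.5 at a small system size, which is the kind of discrepancy that step (2) is meant to detect before any adjustment is computed.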

Computational issues of propensity adjustments

When the propensity needs adjusting, an accurate propensity adjustment function (

1) The expression of ^{nd}-order Taylor expansion in log space.

2) The moment-based approach, from which the functions of mean, variance and covariance are usually derived, is an approximation method that yields a closed ODE system for the moments. In the method used here, the propensity function is approximated by a 2^{nd}-order Taylor expansion, and the moments up to a certain degree (2 in our treatment) are retained, while all higher moments are assumed to be zero. One might expect that a higher-order Taylor expansion would improve the accuracy of

Because computational cost is a major concern in the stochastic simulation of large biochemical reaction networks, a further issue remains to be addressed: how does the propensity function of a reduced system affect the accuracy and efficiency of the various leaping methods that have been proposed to speed up SSA? Moreover, the question of molecular population sizes requires further analysis. Our derivation assumed large reactant populations, but simulations of a reversible pathway indicated that the method works rather well even for small populations. A more careful investigation of the role of population size in different scenarios is still needed and should be the subject of further research.

Conclusions

Gillespie's stochastic simulation algorithm (SSA), as well as later variants, permits three kinds of elementary reactions to be modelled: 0^{th}, 1^{st} and 2^{nd} order reactions that are assumed to follow the law of mass action. All other types of reactions, containing non-integer kinetic orders and/or following other types of kinetic laws, are assumed to be convertible to one of these three kinds, so that SSA can validly be applied. However, the conversion to elementary reactions is often difficult, infeasible, or simply impossible. First, the kinetic parameters of the underlying elementary reactions are in many cases unknown for a complex-order reaction. Second, even when all elementary kinetic parameters are available, the multitude of reaction channels and participating species creates a combinatorial complexity that renders SSA simulations computationally impractical. Within a deterministic framework, model reduction is a possible and often-used strategy to address such challenges. For example, a reduced mechanistic model, such as the Michaelis-Menten rate law, is often proposed to fit the experimental data, at the cost of sacrificing the original mechanistic interpretation. The reduction in these cases simplifies the original formulation by approximating, merging, or omitting intermediate reaction steps and reactants.

In this article, we propose a rather general strategy for converting a deterministic process model into a corresponding stochastic model and characterize the mathematical connections between the two. The deterministic framework is assumed to be a generalized mass action system and the stochastic analogue is in the format of the chemical master equation. The analysis identifies situations where a direct conversion is valid, where internal noise affecting the system needs to be taken into account, and where the propensity function must be mathematically adjusted. The conversion from deterministic to stochastic models is illustrated with several representative examples, including reversible reactions with feedback controls, Michaelis-Menten enzyme kinetics, a genetic regulatory motif, and stochastic focusing. The construction of a stochastic model for a biochemical network requires the use of information from an equation-based model. The conversion strategy proposed here guides a model design process that ensures a valid transition between deterministic and stochastic models.

Authors' contributions

JW developed the mathematical derivations, designed and performed the simulation, and drafted the manuscript. BV contributed to the statistical reasoning and revised the manuscript. EV supervised the research and revised the manuscript. All authors read and approved the final manuscript.

Acknowledgements

The authors thank Dr. Yi Jiang for useful comments and for providing seminal references. The authors also appreciate Dr. Mukhtar Ullah's and Dr. Olaf Wolkenhauer's insightful comments and conceptual clarifications. This work was supported in part by a Molecular and Cellular Biosciences Grant (MCB-0946595; E.O. Voit, PI) from the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsoring institutions.