Department of Physics, Humboldt University of Berlin, Berlin, Germany

School of Biological Sciences, University of Edinburgh, Edinburgh, UK

SynthSys Edinburgh, University of Edinburgh, Edinburgh, UK

Abstract

Background

It is well known that the deterministic dynamics of biochemical reaction networks can be more easily studied if timescale separation conditions are invoked (the quasi-steady-state assumption). In this case the deterministic dynamics of a large network of elementary reactions are well described by the dynamics of a smaller network of effective reactions. Each of the latter represents a group of elementary reactions in the large network and has associated with it an effective macroscopic rate law. A popular method to achieve model reduction in the presence of intrinsic noise consists of using the effective macroscopic rate laws to heuristically deduce effective probabilities for the effective reactions which then enables simulation via the stochastic simulation algorithm (SSA). The validity of this heuristic SSA method is

Results

We here obtain, by rigorous means and in closed form, a reduced linear Langevin equation description of the stochastic dynamics of monostable biochemical networks in conditions characterized by small intrinsic noise and timescale separation. The slow-scale linear noise approximation (ssLNA), as the new method is called, is used to calculate the intrinsic noise statistics of enzyme and gene networks. The results agree very well with SSA simulations of the non-reduced network of elementary reactions. In contrast, the conventional heuristic SSA is shown to overestimate the size of noise for Michaelis-Menten kinetics, to considerably underestimate the size of noise for Hill-type kinetics and, in some cases, even to miss the prediction of noise-induced oscillations.

Conclusions

A new general method, the ssLNA, is derived and shown to correctly describe the statistics of intrinsic noise about the macroscopic concentrations under timescale separation conditions. The ssLNA provides a simple and accurate means of performing stochastic model reduction and hence it is expected to be of widespread utility in studying the dynamics of large noisy reaction networks, as is common in computational and systems biology.

Background

Biochemical pathways or networks are typically very large. A well-characterized example is the protein-protein interaction network of the yeast

A common way of circumventing the problem is to simulate a network of species which is much smaller than the size of the full network but which nevertheless captures the essential dynamics. For example, the three elementary (unimolecular or bimolecular) reactions which describe the enzyme-assisted catalysis of substrate

On the macroscopic level, where molecule numbers are so large that intrinsic noise can be ignored, there is a well-known practical recipe for obtaining this reduced or coarse-grained network from the full network of elementary reactions. One writes down the rate equations (REs) for each species, decides which species are fast and slow, sets the time derivative of the concentration of the fast species to zero, solves for the steady-state concentrations of the fast species and finally substitutes these concentrations into the equations for the slow species. This procedure is the deterministic quasi-steady-state assumption (QSSA). The result is a set of new REs for the slow species only; corresponding to these reduced equations is the coarse-grained network, i.e., the network of reactions between slow species whose macroscopic rate laws are dictated by the new REs. Generally, all coarse-grained networks will have at least one reaction which is non-elementary; however those reactions involving the interaction of only slow species in the full network will naturally also remain elementary in the coarse-grained network. The deterministic QSSA presents a rigorous method of achieving a coarse-grained macroscopic description based on the deterministic REs
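The recipe above can be sketched numerically for the single-enzyme mechanism E + S ⇌ C → E + P. This is a minimal illustration and not part of the original analysis: the function names, the rate-constant values and the forward-Euler integrator are all assumptions chosen for brevity. It integrates the full REs and the reduced RE from the same initial substrate concentration and compares the results.

```python
# Deterministic QSSA sketch for E + S <-> C -> E + P.
# All names and parameter values are illustrative assumptions.

def full_rhs(s, c, k1, km1, k2, e_total):
    """Rate equations of the full network; free enzyme is e_total - c."""
    e = e_total - c
    ds = -k1 * e * s + km1 * c
    dc = k1 * e * s - (km1 + k2) * c
    return ds, dc


def reduced_rhs(s, k1, km1, k2, e_total):
    """Reduced RE: set dc/dt = 0, solve for c, substitute into ds/dt."""
    km = (km1 + k2) / k1  # Michaelis-Menten constant
    return -k2 * e_total * s / (km + s)


def integrate(k1=10.0, km1=10.0, k2=1.0, e_total=0.01, s0=1.0,
              t_end=100.0, dt=1e-3):
    """Forward-Euler integration of both models from the same substrate level."""
    s_full, c, s_red = s0, 0.0, s0
    for _ in range(int(round(t_end / dt))):
        ds, dc = full_rhs(s_full, c, k1, km1, k2, e_total)
        s_full, c = s_full + dt * ds, c + dt * dc
        s_red += dt * reduced_rhs(s_red, k1, km1, k2, e_total)
    return s_full, s_red


s_full, s_red = integrate()
print(f"full network: {s_full:.4f}, reduced network: {s_red:.4f}")
```

With the total enzyme concentration much smaller than the substrate concentration, the complex is the fast species and the two trajectories stay close; increasing `e_total` in this sketch degrades the agreement.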

On the mesoscopic level, or, in other words, whenever the size of intrinsic noise becomes comparable with the average molecule numbers, the description of chemical kinetics is given by the CME. One would hope that under conditions of timescale separation, just as one can write effective REs for a coarse-grained network starting from the REs of the full network, in a similar manner one can obtain an effective (or reduced) CME for the coarse-grained network starting from the CME of the full network. The effective REs have information about the macroscopic concentrations of the slow species only, while the effective CME has information about the fluctuations of the slow species only. This line of reasoning has led to a stochastic formulation of the QSSA which is in widespread use. In what follows we concisely review the CME formulation of stochastic kinetics and point out compelling reasons which cast doubt on the validity of the popular stochastic QSSA.

Suppose the network (full or coarse-grained) under consideration consists of a number

Here, _{
i
}denotes chemical species _{
ij
}and _{
ij
} are the stoichiometric coefficients and _{
j
}is the macroscopic rate coefficient of the reaction. If reaction scheme (1) describes the full network with _{
s
} number of slow species and _{
f
}=_{
s
} number of fast species, then we adopt the convention that _{1}to
_{
N
} label the fast species. Let _{
i
}denote the absolute number of molecules of the _{
j
} for the

where
_{
j
}, i.e., the rate of reaction according to the deterministic REs, is also shown alongside the microscopic rate functions. Note that

**Microscopic and macroscopic rate functions.** The macroscopic rate function

If we are modeling the full network, then the constituent reactions have to be all elementary. For such reactions, the propensity and microscopic rate functions have been derived from molecular physics

_{2}
_{5}; he showed that “the master equation for a complex chemical reaction cannot always be reduced to a simpler master equation, even if there are fast and slow individual reaction steps”

Notwithstanding the fundamental objections of Janssen, and frequently in the name of pragmatism, many studies

In this article we seek to derive a rigorous alternative to the heuristic approach. Given the CME of the full network of elementary reactions, we derive a reduced linear Fokker-Planck equation (FPE) which describes the noise statistics of the same network when the molecule numbers are not too small and under the same conditions of timescale separation imposed by the deterministic QSSA. This new FPE is the legitimate mesoscopic description of intrinsic noise about the macroscopic concentrations of the coarse-grained network as obtained by the deterministic QSSA. The noise statistics from this approach are compared with stochastic simulations of the full network and with simulations of the coarse-grained network using the conventional heuristic approach. In all cases our approach agrees very well with the full network results. In contrast, we show that the size of intrinsic noise predicted by the conventional approach can differ from the actual value by more than an order of magnitude and that in some instances this approach even misses the existence of noise-induced oscillations. We also show how our method can be used to identify the regions of parameter space where the conventional approach qualitatively fails and where it fares well.

The article is organized as follows. In the Results section, we discuss in general terms the procedure of obtaining a rigorous mesoscopic description under conditions of timescale separation akin to those of the deterministic QSSA. We then apply this novel method to two different examples: an enzyme mechanism capable of displaying both Michaelis-Menten and Hill-type kinetics and a gene network with a single negative feedback loop. The results from our method are compared and contrasted with stochastic simulations of the full network and with those of the coarse-grained network using the conventional heuristic method. We close with a discussion. Detailed derivations concerning results and applications can be found in the Methods section.

Results

The optimal method to determine the validity of the heuristic CME would be to obtain its analytical solution and compare it with that of the CME for the full network and for rate constants chosen such that the deterministic QSSA is valid. Note that the latter constraint on rate constants is necessary because the propensities of the heuristic CME are based on the macroscopic rate laws as given by the reduced REs and hence the heuristic CME can only give meaningful results if the deterministic QSSA is valid. Unfortunately, CMEs are generally analytically intractable, with exact solutions only known for a handful of simple elementary reactions

where
_{
i
} is proportional to the noise about this concentration. This substitution leads to an infinite expansion of the master equation. The first term, that proportional to Ω^{1/2}, leads to the deterministic equations for the mean concentrations as predicted by the CME in the macroscopic limit of large volumes (or equivalently large molecule numbers). The rest of the terms give a time-evolution equation for the probability density function of the fluctuations,
^{0}, leads to a second-order partial differential equation, also called the linear Fokker-Planck equation or the linear noise approximation (LNA)

Hence we can now formulate two questions to precisely determine the validity of the heuristic CME in timescale separation conditions: (i) in the macroscopic limit, are the mean concentrations of the heuristic CME exactly given by the reduced REs obtained from the deterministic QSSA? (ii) are the noise statistics about these mean concentrations, as given by the LNA applied to the heuristic CME, equal to the noise statistics obtained from applying the LNA to the CME of the full network? If the heuristic CME is correct, then the answer to both these questions should be yes.

The first question can be answered straightforwardly. The deterministic equations for the mean concentrations of the heuristic CME, in the macroscopic limit of infinite volumes, necessarily only depend on the macroscopic limit of the heuristic microscopic rate functions in the heuristic CME. More specifically, consideration of the first term of the system-size expansion leads to a deterministic set of equations of the form

The second question, regarding agreement in the noise statistics and not simply in the means, has not been considered before and presents a considerably more difficult challenge. In what follows we briefly review the LNA applied to the heuristic CME of the coarse-grained network, which we shall call the hLNA, and we derive the LNA applied to the full network under conditions of timescale separation, a novel method which we refer to as the slow-scale LNA (ssLNA).

The LNA applied to the heuristic CME

The application of the LNA to the heuristic CME has been the subject of a number of studies

Given the coarse-grained network, reaction scheme (1), one can construct the stoichiometric matrix

It then follows by the LNA that the noise statistics given by the heuristic CME, i.e., equation (2) with heuristic propensities, in the limit of large molecule numbers, are approximately described by the following linear FPE

where
_{
s
}denotes the vector of derivatives with respect to components of the vector

where

The solution of the linear FPE, equation (4), is a multivariate Gaussian and hence noise statistics can be straightforwardly computed. The covariance matrix

where _{
ij
}=〈_{
s,i
}
_{
s,j
}〉. The variance of the fluctuations of species

where

Note that we have chosen to compute the variance and power spectrum as our noise statistics for the following reasons. The variance can be used to calculate the Fano factor (variance of fluctuations divided by the mean concentration) and the coefficient of variation (standard deviation of fluctuations divided by the mean concentration)
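As an illustration of how these noise statistics might be computed in practice, the sketch below solves a Lyapunov equation for a two-species system and then evaluates the Fano factor and the coefficient of variation exactly as defined in the preceding paragraph. It assumes the standard form J C + C Jᵀ + D = 0 for the Lyapunov equation (6), with any volume factor already absorbed into D; the function names and the example matrices are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_lyapunov_2x2(J, D):
    """Solve J C + C J^T + D = 0 (assumed form) for the symmetric 2x2 matrix C.

    The three independent entries (c11, c12, c22) obey a 3x3 linear system,
    solved here with Cramer's rule.
    """
    a = [[2 * J[0][0], 2 * J[0][1], 0.0],
         [J[1][0], J[0][0] + J[1][1], J[0][1]],
         [0.0, 2 * J[1][0], 2 * J[1][1]]]
    b = [-D[0][0], -D[0][1], -D[1][1]]
    d = det3(a)
    sol = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        sol.append(det3(m) / d)
    c11, c12, c22 = sol
    return [[c11, c12], [c12, c22]]


def fano_factor(variance, mean):
    """Variance of the fluctuations divided by the mean concentration."""
    return variance / mean


def coefficient_of_variation(variance, mean):
    """Standard deviation of the fluctuations divided by the mean concentration."""
    return math.sqrt(variance) / mean


# Hypothetical stable Jacobian and diffusion matrix.
J = [[-1.0, 0.0], [1.0, -2.0]]
D = [[2.0, 0.0], [0.0, 2.0]]
C = solve_lyapunov_2x2(J, D)
print(C, fano_factor(C[0][0], 2.0), coefficient_of_variation(C[0][0], 2.0))
```

For larger networks a general-purpose linear algebra routine would replace the hand-written 3x3 solve; the explicit version is used here only to keep the sketch free of dependencies.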

The LNA applied to the full network under conditions of timescale separation

The LNA approach mentioned in the previous subsection works equally well if applied to the CME of the full network. This leads to a linear FPE of the form

where

Note that while equation (4) is based on the heuristic CME and therefore inherits all its problems, equation (8) has no such problems: it is derived from the CME of the full network of elementary reactions, which is fundamentally correct. Hence, ideally, we would obtain the multivariate Gaussian solution of the two linear FPEs, compare and then decide upon the validity of the heuristic CME. Unfortunately, this direct comparison is impossible because equation (4) gives a joint probability distribution function for the slow species only, whereas equation (8) leads to a joint probability density function for both slow and fast species.

In the Methods section we devise an adiabatic elimination method by which, starting from equation (8), we obtain in closed form a linear FPE that describes the time evolution of the joint probability density function of the slow variables only. We call the reduced linear FPE obtained from this method the slow-scale LNA. Our result can be stated as follows. Under conditions of timescale separation consistent with the deterministic QSSA, the noise statistics of the slow species according to the CME of the full network can be approximately described by the following linear FPE

Note that the matrix

where

where

The derivation of the ssLNA leads us to a fundamental conclusion:

where the _{
i
}(

**Determining the validity of the heuristic CME.** Scheme illustrating the analytical approach to determine the validity of the heuristic CME which is used in this article. Parameters are chosen such that the deterministic QSSA is valid and such that molecule numbers are not too small. The LNA is applied to the heuristic CME leading to a linear FPE, describing the noise of the slow species. A different reduced linear FPE describing the noise in the slow species is obtained by applying a rigorous adiabatic elimination method on the linear FPE which approximates the CME of the full network. The noise statistics from the two linear FPEs are compared

One may ask whether there is an effective CME which in the large volume limit can be approximated by the ssLNA, Eq. (12). The form of the noise coefficient in Eq. (12) implies that the ssLNA corresponds to the master equation of an effective reaction scheme with a stoichiometric matrix

Such a reaction scheme is compatible with the reduced REs: defining

In the rest of this article, we apply the systematic comparison method developed in the Results section to two examples of biological importance: enzyme-facilitated catalysis of substrate into product by cooperative and non-cooperative mechanisms and a genetic network with a negative feedback loop. For each of these, we shall obtain the noise properties of the coarse-grained versions of the circuits in the limit of large molecule numbers using the ssLNA and the hLNA. Because the expressions for the noise statistics from these two are quite simple, we shall be able to readily identify the regions of parameter space where the hLNA, and hence the heuristic CME, is correct and where it gives misleading results. The theoretical results are confirmed by stochastic simulations based on the CME of the full network and on the heuristic CME of the coarse-grained network.
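To make the comparison concrete before turning to the applications, the sketch below carries out both calculations for the simplest case: a single-enzyme network with substrate input, in which the substrate is the only slow species and the complex the only fast one. It is not the authors' code. It assumes that adiabatic elimination of the fast fluctuations gives a reduced Jacobian J_ss − J_sf J_ff⁻¹ J_fs and an effective stoichiometry S_s − J_sf J_ff⁻¹ S_f, and that the heuristic diffusion coefficient is the sum of the input rate and the Michaelis-Menten rate. The function name and parameter values are hypothetical.

```python
def mm_noise_sketch(k_in, k1, km1, k2, e_total, omega):
    """Steady-state substrate variance: ssLNA-style reduction versus hLNA.

    Reactions, in order: input, binding, unbinding, catalysis.
    Slow species: substrate s. Fast species: complex c (free enzyme = e_total - c).
    """
    c = k_in / k2  # steady state of the complex: k2 * c = k_in
    if c >= e_total:
        raise ValueError("input rate exceeds the maximum turnover k2 * e_total")
    km = (km1 + k2) / k1
    s = km * c / (e_total - c)  # steady-state substrate concentration
    e = e_total - c

    rates = [k_in, k1 * e * s, km1 * c, k2 * c]  # macroscopic rate functions
    s_slow = [1.0, -1.0, 1.0, 0.0]               # stoichiometry of s
    s_fast = [0.0, 1.0, -1.0, -1.0]              # stoichiometry of c

    # Blocks of the Jacobian of the full rate equations.
    j_ss, j_sf = -k1 * e, k1 * s + km1
    j_fs, j_ff = k1 * e, -(k1 * s + km1 + k2)

    # Reduced Jacobian; it coincides with the Jacobian of the reduced RE.
    j_red = j_ss - j_sf * j_fs / j_ff

    # Effective stoichiometry after eliminating the fast fluctuations (assumed form).
    s_eff = [a - (j_sf / j_ff) * b for a, b in zip(s_slow, s_fast)]
    d_ss = sum(x * x * r for x, r in zip(s_eff, rates))

    # Heuristic diffusion coefficient: input plus effective Michaelis-Menten step.
    d_h = k_in + k2 * e_total * s / (km + s)

    scale = 2.0 * abs(j_red) * omega  # one-variable Lyapunov equation
    return {"substrate": s, "jacobian": j_red,
            "var_sslna": d_ss / scale, "var_hlna": d_h / scale}


result = mm_noise_sketch(k_in=0.5, k1=10.0, km1=1.0, k2=1.0,
                         e_total=1.0, omega=100.0)
print(result)
```

For these illustrative values the heuristic variance exceeds the reduced one, in line with the statement that the heuristic approach overestimates the noise for Michaelis-Menten kinetics. The values were chosen for easy arithmetic and are not claimed to satisfy the timescale separation conditions used in the article.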

Application I: Cooperative and non-cooperative catalytic mechanisms

Many regulatory mechanisms in intracellular biochemistry involve multisubunit enzymes with multiple binding sites

Substrate

Deterministic analysis and network coarse-graining

The full network (14) (without the input reaction) has been previously studied using REs by Tyson

where ^{
′
}is an effective first-order rate coefficient. The Michaelis-Menten constants are

Note that the deterministic QSSA has reduced our network from one with 5 species interacting via 7 elementary reactions, reaction scheme (14), to one with 2 species interacting via 2 reactions, one elementary and one non-elementary, reaction scheme (17). A cartoon representation of the two networks can be found in Figure

**Full and coarse-grained mechanisms of a two-subunit enzyme network.** Cartoon illustrating the full and coarse-grained networks for the two-subunit enzyme network. The reduced, coarse-grained network is obtained from the full network under conditions of timescale separation, i.e., transients in the concentrations of all enzyme and complex species decay much faster than transients in the concentrations of the substrate and product species

Note that throughout the rest of this article, the notation [

Stochastic analysis of the coarse-grained network: ssLNA and hLNA methods

We use the ssLNA (see the Results section) to obtain the Langevin equation for the intrinsic noise _{
s
}(

where _{1} and _{2} are defined as

Note that Γ_{
i
}(_{1}(_{
in
}, Γ_{2}(_{1} and so on for the rest of the noise terms. Hence, we see that according to the ssLNA,

Next we obtain the variance of the substrate fluctuations by applying the LNA to the heuristic CME of the coarse-grained network (hLNA). The heuristic microscopic rate functions, i.e., the propensities divided by the volume, are in this case

Using the prescription for the hLNA (see the Results and Methods sections), one obtains the variance of the fluctuations to be

A comparison of equations (21) and (24) leads one to the observation that the latter can be obtained from the former by setting _{1}=_{2}=0. Substituting these conditions in the Langevin equation, equation (18), we obtain physical insight into the shortcomings of the conventional heuristic method. This method rests on the incorrect implicit assumption that

Stochastic Michaelis-Menten and Hill-type kinetics

We now consider two subcases which are of special interest in biochemical kinetics: (i) _{2}→0, _{
m2}→_{2}→_{
m2}→0 at constant

Hence, the first case leads to Michaelis-Menten (MM) kinetics (non-cooperative kinetics) and the second to Hill-type kinetics with a Hill coefficient of two (cooperative kinetics).

Applying limit (i) to equations (21) and (24), we obtain the variance of the fluctuations for Michaelis-Menten kinetics as predicted by the ssLNA and the hLNA

Similarly applying limit (ii) to equations (21) and (24), we obtain the variance of the fluctuations for Hill-type kinetics as predicted by the ssLNA and the hLNA

Comparison of equations (27) and (28) shows that the

The results for Hill-type kinetics are shown in Figure
_{S} and the Fano factor FF_{S} of the substrate concentration fluctuations (as predicted by equations (29) and (30)) versus the non-dimensional fraction

**Noise statistics for cooperative two-subunit enzyme network.** Plots showing the Fano factor multiplied by the volume, ΩFF_{S}, (**a**) and the coefficient of variation squared, CV_{S}^{2}, (**b**) for the substrate fluctuations as a function of the non-dimensional fraction, for three values of k_{1}: 5×10^{−3} (yellow), 5×10^{−5} (purple), and 5×10^{−7} (blue). The remaining parameters are given by k_{2}=1000, k_{−1}=k_{−2}=100, k_{3}=k_{4}=1. Note that in (**a**) the black dashed line indicates the hLNA prediction for all three different values of k_{1}, which are indistinguishable in this figure. The stochastic simulations were carried out for a volume Ω=100. In (**c**) sample paths of the SSA for the full network (gray), the slow scale Langevin equation (red) as given by equation (18) and the SSA with heuristic propensities (blue) are compared for

**Validity of the deterministic QSSA for the cooperative two-subunit enzyme network.** Plot of the macroscopic substrate concentration [

The following observations can be made from Figures
_{1}=5×10^{−7} and _{S} and FF_{S} from the hLNA are approximately 11 and 112 times smaller, respectively, than the prediction of the ssLNA. In Figure

Besides quantitative disagreement we also note that the qualitative dependence of the FF_{S} and the CV_{S} with _{1}=5×10^{−7}, according to stochastic simulations of the full network and the ssLNA, the FF_{S} reaches a maximum at _{S}with _{S}which is much greater than 1, whereas the heuristic approach predicts ΩFF_{S} which is below 1. Hence, for

The power spectrum for the substrate fluctuations has also been calculated (see the Methods section). Although there is some quantitative disagreement between the predictions of the ssLNA and hLNA both are in qualitative agreement: the spectrum is monotonic in the frequency and hence no noise-induced oscillations are possible by this mechanism. More generally, it can be shown that the spectra of the hLNA and ssLNA are in qualitative agreement for all full networks with at most one slow species because as can be deduced from equation (7), for such networks, the spectrum for a single species chemical system is invariably monotonic in the frequency.
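The monotonicity argument can be checked with a few lines of code. The sketch assumes that, for a single slow species obeying a linear Langevin equation with Jacobian J < 0 and diffusion coefficient D, the spectrum takes the Lorentzian form D / (J² + ω²) up to a convention-dependent prefactor; the exact form of equation (7) is not reproduced here, and the numerical values are hypothetical.

```python
def spectrum_single_species(omega, jacobian, diffusion):
    """Lorentzian spectrum of a one-variable linear Langevin equation (assumed form)."""
    return diffusion / (jacobian ** 2 + omega ** 2)


# Hypothetical values for the reduced Jacobian and diffusion coefficient.
omegas = [0.1 * i for i in range(200)]
values = [spectrum_single_species(w, -1.25, 0.875) for w in omegas]

# The spectrum decreases at every step of the frequency grid, so it has no
# peak at non-zero frequency, whatever the value of the diffusion coefficient.
is_monotonic = all(a > b for a, b in zip(values, values[1:]))
print(is_monotonic)
```

Because the diffusion coefficient only rescales the curve, the hLNA and the ssLNA can disagree on the height of a single-species spectrum but not on its shape.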

Application II: A gene network with negative feedback

Finally, we study an example of a gene network with autoregulatory negative feedback. Such a feedback mechanism is ubiquitous in biology appearing in such diverse contexts as metabolism

We consider the following prototypical gene network. For convenience, we divide the network into two parts: (i) the set of reactions which describe transcription, translation and degradation, and (ii) the set of reactions which constitute the negative feedback loop. The first part is described by the reactions

The mRNA,

Note that the gene with two bound proteins is inactive, in the sense that it does not lead to mRNA production. This implies that sudden increases in protein concentration lead to a decrease in mRNA transcription which eventually results in a lowered protein concentration; this is the negative feedback or auto-inhibitory mechanism. The reaction network as given by reaction schemes (31) and (32) is our full network for this example. Note that the first two reactions in reaction scheme (31) are not in reality elementary chemical reactions but they are the simplest accepted forms of modeling the complex processes of transcription and translation and hence it is in this spirit that we include them in our full network description.

Deterministic analysis and coarse-grained network

Model reduction on the macroscopic level proceeds by applying the deterministic QSSA to the REs of the full network (see the Methods section for details). The fast species are the enzyme, _{2}. The slow species are the mRNA, _{2}→_{1}→0 at constant _{2}
_{1}; this enforces cooperative behavior since the binding of

where ^{2}=_{−1}
_{−2}/_{1}
_{2},

**Full and coarse-grained mechanisms of a gene network.** Cartoon illustrating the full and coarse-grained networks for the gene network with a single negative feedback loop. The reduced, coarse-grained network is obtained from the full network under conditions of timescale separation, i.e., transients in the concentrations of all enzyme, enzyme complex, gene and gene complex species decay much faster than transients in the concentrations of mRNA and protein

Stochastic analysis of the coarse-grained network: ssLNA and hLNA methods

We denote _{
s,1}and _{
s,2} as the fluctuations about the concentrations of mRNA and of protein, respectively. The ssLNA leads to reduced Langevin equations of the form

where Γ_{
i
}(_{2}, _{2}→_{
ij
} denotes the

where _{3}=_{−3}/_{3}. Note that the coupled Langevin equations (34) imply that the fluctuations in the mRNA and protein concentrations are affected by noise from all of the 11 constituent reactions of the full network (reaction schemes (31) and (32)) except from those of the reversible reaction _{2}. As shown in the Methods section, the noise from this reaction becomes zero due to the imposition of cooperative behavior in the feedback loop.

The covariance matrix for the fluctuations of the Langevin equations (34) is given by the Lyapunov equation (6) with Jacobian being equal to that of the reduced REs (33) and diffusion matrix

It is also possible to calculate the covariance matrix of the fluctuations of the slow variables using the hLNA (see the Methods section). This is given by a Lyapunov equation (6) with Jacobian being equal to that of the reduced REs (33) and diffusion matrix

A comparison of equation (37) and equation (38) shows that the ssLNA and hLNA are generally different except in the limits of _{1}→0 and _{1}=0 implies ignoring the noise due to the reversible reaction

Furthermore, by the comparison of equations (37) and (38), one can also deduce that the heuristic CME provides a statistically correct description when the protein concentration [

Detailed comparison of the noise statistics from the ssLNA and hLNA

Figure
_{0}. These are obtained by solving the two Lyapunov equations mentioned in the previous subsection for the covariance matrix; the variances are then the diagonal elements of this matrix, from which one finally calculates the Fano factors and the coefficients of variation. The values of rate constants are chosen such that we have timescale separation conditions (see Figure
_{3}=1 and _{0}>50, the predictions of the ssLNA are approximately 3 orders of magnitude larger than those of the hLNA (and of stochastic simulations using the heuristic CME).

**Noise statistics of the gene network.** Dependence of the Fano factor (**a**) and of the coefficient of variation squared (**b**) of the protein fluctuations on the rate of transcription k_{0}, according to the ssLNA (solid lines) and the hLNA (dashed lines). The noise measures are calculated for three values of the bimolecular constant: k_{3}=1 (yellow), k_{3}=0.1 (purple), k_{3}=0.01 (blue). All other parameters are given by k_{1}=10^{−5}, k_{2}=100, k_{−1}=k_{−2}=k_{−3}=10, k_{4}=k_{s}=k_{dM}=1. Stochastic simulations of the full networks (solid circles) and of the coarse-grained network (open circles) using the CME and the heuristic CME, respectively, were performed for a volume of Ω=100. Note that at this volume there is one gene and 100 enzyme molecules. Note also that the chosen parameters guarantee timescale separation (validity of the deterministic QSSA) and cooperative behavior in the feedback loop (see Figure

**Validity of the deterministic QSSA for the gene network.** Plot of the macroscopic substrate concentrations of mRNA, [ k_{0} is 50. The excellent agreement between the two RE solutions implies timescale separation conditions

Finally, we investigate the differences between the predictions of the ssLNA and hLNA for noise-induced oscillations in the mRNA concentrations. These are oscillations which are predicted by CME based approaches but not captured by RE approaches. In particular, these noise-induced oscillations occur in regions of parameter space where the REs predict a stable steady-state
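A common operational test for noise-induced oscillations is a peak at non-zero frequency in the power spectrum of the fluctuations, even though the Jacobian has only eigenvalues with negative real part. The sketch below applies this test to a two-variable linear Langevin equation. It assumes the standard spectrum formula [(iωI − J)⁻¹ D ((iωI − J)⁻¹)†] for such an equation, omits normalization prefactors, and uses hypothetical Jacobians that are not derived from the gene network of this article.

```python
def spectrum(J, D, omega, idx=0):
    """Spectrum of variable idx for a 2x2 linear Langevin equation (assumed form)."""
    a = [[1j * omega - J[0][0], -J[0][1]],
         [-J[1][0], 1j * omega - J[1][1]]]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    inv = [[a[1][1] / det, -a[0][1] / det],
           [-a[1][0] / det, a[0][0] / det]]
    total = 0.0
    for k in range(2):
        for l in range(2):
            total += (inv[idx][k] * D[k][l] * inv[idx][l].conjugate()).real
    return total


def peak_frequency(J, D, omegas, idx=0):
    """Frequency on the grid at which the spectrum of variable idx is largest."""
    return max(omegas, key=lambda w: spectrum(J, D, w, idx))


omegas = [0.05 * i for i in range(200)]
D = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical negative feedback: variable 1 represses variable 0.
# Eigenvalues are -1 +/- 2i, so the deterministic steady state is stable.
J_feedback = [[-1.0, -4.0], [1.0, -1.0]]

# Hypothetical cascade without feedback: both eigenvalues are real.
J_cascade = [[-1.0, 0.0], [1.0, -2.0]]

peak_feedback = peak_frequency(J_feedback, D, omegas)
peak_cascade = peak_frequency(J_cascade, D, omegas)
print(peak_feedback, peak_cascade)
```

Since the position and existence of the peak depend on the diffusion matrix as well as on the Jacobian, two reduced descriptions that share a Jacobian but differ in their diffusion matrices can disagree on whether a peak exists at all.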

**Noise-induced oscillations in the gene network.** Comparison of the predictions of noise-induced oscillations in the mRNA concentrations by ssLNA and hLNA methods. Panel (**a**) shows a stochastic bifurcation diagram depicting the regions in the translation rate (k_{s}) versus transcription rate (k_{0}) parameter space where both methods predict no oscillations (black), both predict oscillations (red) and only the ssLNA correctly predicts an oscillation (blue). There is no steady-state in the white region. Panels (**b**), (**c**) and (**d**) show spectra at 3 points in the blue, red and black regions of the bifurcation plot in (**a**) (these points are marked by Roman numerals). The solid and dashed lines show the predictions of the ssLNA and the hLNA respectively, while the dots and circles show the results of stochastic simulations of the full and coarse-grained network using the CME and the heuristic CME, respectively. The parameters are given by Ω=1000, k_{dM}=0.01, k_{1}=0.001, k_{−1}=100, k_{2}=1000, k_{−2}=1, k_{−3}=10, k_{3}=0.1, k_{4}=10. These parameters guarantee timescale separation (validity of the deterministic QSSA) and cooperative behavior in the feedback loop. Note that the hLNA spectrum in (**b**) and (**c**) is scaled up 5000 and 1000 times, respectively

We emphasize that the main message brought by our analysis is that there are significantly large regions of parameter space (blue region in Figure

Qualitative discrepancies in the prediction of noise-induced oscillations arise because the hLNA does not correctly take into account the fluctuations stemming from the rate limiting step of the cooperative binding mechanism. The latter involves the slow binding reaction between a protein molecule

Discussion and conclusion

In conclusion, we have rigorously derived, in closed form, linear Langevin equations which describe the noise statistics of the fluctuations about the deterministic concentrations as predicted by the reduced REs obtained from the deterministic QSSA. Equivalently, the ssLNA, as we call the method, is the statistically correct description of biochemical networks under conditions of timescale separation and sufficiently large molecule numbers. We note that our method provides an accurate means of performing stochastic simulation in such conditions. This is particularly relevant since it has been proven that there is generally no reduced CME description in such cases

The limitations of the ssLNA are precisely those of the conventional LNA on which it is based. Namely, if the system contains at least one bimolecular reaction, then the ssLNA is valid for large enough molecule numbers (or, equivalently, large volumes) and provided the biochemical network is monostable. If the system is composed purely of first-order reactions and if one is only interested in variances and power spectra, then the only requirement is that of monostability. This is because in such a case it is well known that the first and second moments are exactly given by the LNA. For monostable systems with bimolecular reactions, the finite-volume corrections to the LNA can be considerable when the network has implicit conservation laws, when bursty phenomena are at play and when steady-states are characterized by a few tens or hundreds of molecules

A necessary and sufficient condition for timescale separation is that the timescales governing the decay of the transients in the average concentrations are well separated. Fast species are those whose transients decay on fast timescales while the slow species are those whose transients decay on slow timescales. At the microscopic level, there are several different scenarios which can lead to timescale separation. Grouping chemical reactions as fast or slow according to the relative size of their associated timescales, Pahlajani

These results are in line with those of Mastny

Finally we consider the approach of Shahrezaei and Swain

While the stochastic simulation algorithm explicitly simulates every individual reaction event, the Langevin approach yields approximate stochastic differential equations for the molecular populations. This is computationally advantageous whenever the reactant populations are quite large
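The difference in cost can be seen in a toy comparison for a birth-death process, ∅ → X → ∅. The sketch is illustrative only: the function names, the parameter values, the fixed random seed and the Euler-Maruyama discretization are assumptions, and the Langevin equation used is the linear-noise form with the noise strength evaluated at the macroscopic steady state. The SSA executes a number of events that grows in proportion to the volume, whereas the number of Langevin time steps does not depend on the volume.

```python
import math
import random


def ssa_birth_death(k_in, k_deg, omega, t_end, rng):
    """Gillespie SSA for 0 -> X (propensity k_in*omega) and X -> 0 (propensity k_deg*n).

    Returns the time-averaged concentration n/omega and the number of events.
    """
    t, n = 0.0, int(round(k_in * omega / k_deg))
    area, events = 0.0, 0
    while True:
        a_birth, a_death = k_in * omega, k_deg * n
        a_total = a_birth + a_death
        tau = rng.expovariate(a_total)  # waiting time to the next event
        if t + tau > t_end:
            area += n * (t_end - t)
            break
        area += n * tau
        t += tau
        events += 1
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
    return area / (t_end * omega), events


def langevin_birth_death(k_in, k_deg, omega, t_end, dt, rng):
    """Euler-Maruyama integration of the linear Langevin equation for x = n/omega.

    Returns the time-averaged concentration and the number of time steps.
    """
    phi = k_in / k_deg                          # macroscopic steady state
    amplitude = math.sqrt(2.0 * k_in / omega)   # noise strength at steady state
    x, total = phi, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x += (k_in - k_deg * x) * dt + amplitude * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += x
    return total / steps, steps


rng = random.Random(1)
mean_ssa, n_events = ssa_birth_death(1.0, 1.0, 100.0, 200.0, rng)
mean_langevin, n_steps = langevin_birth_death(1.0, 1.0, 100.0, 200.0, 0.01, rng)
print(mean_ssa, n_events, mean_langevin, n_steps)
```

Raising `omega` by a factor of ten multiplies the SSA event count by roughly ten while leaving `n_steps` unchanged, which is the computational advantage referred to above.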

We emphasize that besides deriving the ssLNA method, in this paper we have used it to determine the range of validity of the conventional heuristic CME approach and the size of errors in its predictions. To our knowledge, this is the first study which attempts to answer these important and timely questions via a rigorous, systematic theoretical approach.

Our main message is that the “conventional wisdom” that the heuristic CME is generally a good approximation to the CME of the full network under conditions of timescale separation is incorrect if one is interested in intrinsic noise statistics and the prediction of noise-induced oscillations.

Methods

Derivation of the ssLNA

The linear FPE describing the full network is given by equation (8). It is well known that with every FPE one can associate a set of Langevin equations (stochastic differential equations)

The set of coupled Langevin equations equivalent to equation (8) are

Note that the time-dependence of the matrices in the above equations comes from that of the macroscopic concentrations of fast and slow species. Now suppose that we impose timescale separation conditions, i.e., the correlation time of fast fluctuations, $\tau_f$, is much smaller than the correlation time of slow fluctuations, $\tau_s$. We wish to obtain a reduced description for the fast fluctuations, i.e., for equation (39), on timescales larger than $\tau_f$ but much smaller than $\tau_s$. On such timescales, transients in the macroscopic concentrations of fast species have decayed, a quasi-steady state is achieved and, by the deterministic QSSA, the fast-species concentrations can be expressed in terms of the slow-species concentrations. The latter vary very slowly over timescales much smaller than $\tau_s$, implying that for all intents and purposes they can be considered constant. Hence the matrices in equation (39) can be considered time-independent. It then follows that the solution to the latter equation is approximately given by

where we have put $t_0 \to -\infty$

where


Since we are interested in a description on timescales larger than $\tau_f$, i.e., for fluctuations of frequency

Taking the inverse Fourier transform of the above equation and substituting in equation (40) we obtain

This Langevin equation is the ssLNA: it is an effective stochastic description of the intrinsic noise in the slow variables in timescale separation conditions. Using standard methods

A note on the reduced Jacobian of the ssLNA

Here we show that the reduced Jacobian

with slow and fast perturbations

Hence the Jacobian in the ssLNA equations (9) and (12) is the same as the Jacobian of the reduced REs.
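As a concrete check of this statement, the following minimal Python sketch compares the two Jacobians for a hypothetical Michaelis-Menten network with substrate input (not the network treated in this paper). The block names `j_ss`, `j_sf`, `j_fs`, `j_ff`, the rate constant names and all numerical values are assumptions introduced for illustration, and the reduced Jacobian is assumed to take the Schur-complement form J_ss − J_sf J_ff⁻¹ J_fs, which reduces to a scalar expression here because there is one slow and one fast species.

```python
# Hypothetical network (illustration only):
#   0 -> S (k_in),  S + E <-> C (k1, km1),  C -> E + P (k2)
# Slow species: S. Fast species: C. Free enzyme is e_tot - c by conservation.

def full_jacobian_blocks(s, c, k1, km1, k2, e_tot):
    """Partial derivatives of the full rate equations, grouped by slow/fast."""
    j_ss = -k1 * (e_tot - c)       # d(dS/dt)/dS
    j_sf = k1 * s + km1            # d(dS/dt)/dC
    j_fs = k1 * (e_tot - c)        # d(dC/dt)/dS
    j_ff = -(k1 * s + km1 + k2)    # d(dC/dt)/dC
    return j_ss, j_sf, j_fs, j_ff

def schur_jacobian(j_ss, j_sf, j_fs, j_ff):
    """Assumed reduced Jacobian: J_ss - J_sf * J_ff^{-1} * J_fs (scalar case)."""
    return j_ss - j_sf * j_fs / j_ff

def reduced_re_jacobian(s, k1, km1, k2, e_tot):
    """d/dS of the reduced rate equation dS/dt = k_in - k2*E_T*S/(K_M + S)."""
    k_m = (km1 + k2) / k1
    return -k2 * e_tot * k_m / (k_m + s) ** 2

# Illustrative parameter values and substrate concentration.
k1, km1, k2, e_tot = 2.0, 1.0, 3.0, 0.5
s = 1.7

# Quasi-steady-state complex concentration for the given substrate level.
k_m = (km1 + k2) / k1
c_qss = e_tot * s / (k_m + s)

j_schur = schur_jacobian(*full_jacobian_blocks(s, c_qss, k1, km1, k2, e_tot))
j_reduced = reduced_re_jacobian(s, k1, km1, k2, e_tot)
print(j_schur, j_reduced)
```

For networks with several slow or fast species the same comparison would use matrix inversion in place of the scalar division.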

Note that equations (48) are formally the same as obtained by taking the average of the LNA equations (39) and (40) (this general agreement between the LNA and linear stability analysis is discussed in

Details of the derivations for the two-subunit enzyme network

The ssLNA recipe: Langevin equation and noise statistics

We here show the details of the ssLNA method as applied to the network discussed in Application I in the Results section. The first step of the recipe is to cast the reaction scheme of the full network (14) into the form of the general reaction scheme (1). This is done by setting _{1}=_{2}=_{3}=_{4}=

Note that the row number of the stoichiometric matrix reflects the species number, while the column number reflects the reaction number. The order of the entries in the macroscopic rate function vector reflects the reaction number.

The enzyme can only be in one of three forms whose concentrations sum to $[E_T]$, the total enzyme concentration, which is a time-independent constant. Hence, we are free to remove information from the stoichiometric matrix about one of the enzyme forms; we choose to remove information about

Note that we have also partitioned the stoichiometric matrix into two sub-matrices as required by our method (see prescription for ssLNA in the Results section). Now we can use this matrix together with the macroscopic rate function vector

where we also partitioned the matrix into 4 sub-matrices as required by our formulation of the ssLNA in the Results section. Now we can use the two sub-matrices of the stoichiometric matrix and the four sub-matrices of the Jacobian to calculate the matrix $\mathbf{A} - \mathbf{B}$

where _{1}and _{2} are as defined in the main text by equations (19) and (20). Furthermore, the Jacobian of the reduced RE, equation (15) in the main text, is given by

Finally, the Langevin equation, equation (18), is obtained by substituting equations (55) and (54) in equation (12). The equation for the variance of the substrate fluctuations, equation (21), is obtained by substituting equation (54) in equation (10) to obtain the new diffusion scalar $D_{ss}$ and then substituting the latter together with the new Jacobian, equation (55), in the Lyapunov equation, equation (6), with $D_h$ replaced by $D_{ss}$. Note that in this example, because we have only one slow species, the Lyapunov equation is not a matrix equation but simply a single linear algebraic equation for the variance. For the same reason we have a diffusion scalar rather than a diffusion matrix. The power spectrum can be obtained by substituting the new Jacobian and diffusion scalar in equation (7) (with $D_h$ replaced by $D_{ss}$), leading to

The power decays monotonically with frequency, which implies no noise-induced oscillation; this statement is generally true for all networks (full or coarse-grained) which have just one slow species.
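For a single slow species, the Lyapunov equation and the spectrum reduce to scalar formulas, which the following minimal Python sketch evaluates. It assumes the conventions 2Jσ² + D = 0 for the variance and a Lorentzian D/(ω² + J²) for the spectrum; factors of volume and of 2π, which depend on the normalization used in equations (6) and (7), are deliberately omitted. The function names and numerical values are hypothetical.

```python
def scalar_variance(jac, diff):
    """Solve the scalar Lyapunov equation 2*jac*var + diff = 0 for var."""
    if jac >= 0:
        raise ValueError("a stable steady state requires a negative Jacobian")
    return -diff / (2.0 * jac)

def lorentzian_spectrum(omega, jac, diff):
    """Power spectrum of a scalar linear Langevin equation (unnormalized)."""
    return diff / (omega ** 2 + jac ** 2)

# Illustrative values for the reduced Jacobian and diffusion scalar.
jac, diff = -0.8, 1.5

variance = scalar_variance(jac, diff)
omegas = [0.1 * i for i in range(50)]
spectrum = [lorentzian_spectrum(w, jac, diff) for w in omegas]

# The spectrum has its maximum at zero frequency and decays monotonically,
# so no peak at non-zero frequency can occur.
is_monotonic = all(a > b for a, b in zip(spectrum, spectrum[1:]))
print(variance, is_monotonic)
```

Because the denominator ω² + J² grows with ω for any real J, the monotonic decay does not depend on the particular values chosen here.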

The hLNA recipe: Langevin equation and noise statistics

Here we apply the LNA to the heuristic CME according to the method described in the Results section. The coarse-grained network is given by reaction scheme (17): an elementary reaction for the substrate input process and a non-elementary first-order reaction for substrate catalysis. The stoichiometric matrix and macroscopic rate function vector are given by

where the primed rate constant is defined in the main text, equation (16). The diffusion scalar $D_h$ of the linear FPE approximating the heuristic master equation for this process can be constructed from the stoichiometric matrix and the macroscopic rate function vector using equation (5), which leads to

In the Results section, it was shown that a reduced CME description becomes possible whenever the effective stoichiometric matrix


Details of the derivations for the gene network example

Reduced rate equations

The fast species of the genetic network with negative feedback given in the main text are given by the gene species _{2} and the enzyme species _{2} read

Substituting the gene conservation law, setting the time derivatives to zero and solving these two equations simultaneously, we obtain the quasi-steady-state concentrations of the three gene species

where $K_1 = k_{-1}/k_1$, $K_2 = k_{-2}/k_2$ and $K^2 = K_1 K_2$. Since only the ternary complex (the one with 3 molecules) does not lead to mRNA production, the active gene fraction is given by

where in the last step we have drawn the limit of cooperative binding _{2}→0 at constant _{2}→_{1}→0 at constant _{1}
_{2}). It follows that the REs for the slow variables of mRNA and protein concentrations are then given by

where

Derivation of the ssLNA results

We cast the species in the full network (as given by reaction schemes (31) and (32)) into the form required by the convention set in the Introduction. We denote the slow species by _{1}=_{2}=_{3}=_{4}=_{2} and _{5}=_{2}, _{2}→

The stoichiometric matrix and the macroscopic rate function vector are constructed as

Note that the columns of


From


where the individual submatrices read explicitly

Using these Jacobian submatrices, the stoichiometric submatrices given in equation (65) and the diagonal matrix $\mathbf{F}$, one can calculate the matrix $\mathbf{A} - \mathbf{B}$,

where

Note that $K_3 = k_{-3}/k_3$. The Jacobian can be obtained from the reduced REs, equation (64), and is given by

Note that we have drawn the limit of cooperative binding on _{1}, _{2},


Finally the Langevin equation is obtained by substituting equations (76) and (72) in equation (12). To obtain the equation for the variance of the mRNA and protein fluctuations, one must first determine the diffusion matrix


The covariance matrix equation can then be obtained by substituting the new diffusion matrix, equation (77), together with the Jacobian matrix, equation (76), in the Lyapunov equation (6) with $\mathbf{D}_h$ replaced by $\mathbf{D}_{ss}$


where Det


It can be shown that the condition to observe a peak in the mRNA power spectrum is given by

Derivation of the hLNA results

An inspection of the reduced REs, equations (64), shows that the coarse-grained network is composed of 4 reactions, two elementary and two non-elementary with a stoichiometry matrix and a macroscopic rate function vector given by

where we denoted the mRNA as species 1 and the protein as species 2. These can be used to calculate the diffusion matrix of the hLNA using equation (5), which leads to

The covariance matrix and the spectra can be obtained as for the ssLNA. The variances and spectra are given by equation (78) and equation (79) with $D_M$ replaced by $D_{h,M}$, and $D_P$ replaced by $D_{h,P}$.
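The construction of a diffusion matrix from a stoichiometric matrix and a rate vector can be sketched in a few lines of Python. The sketch assumes the standard LNA form D = S diag(f) Sᵀ for equation (5). The stoichiometry is an assumed layout for a four-reaction network in which mRNA (species 1) and protein (species 2) are each produced and removed one molecule at a time, and the numerical rates are hypothetical placeholders rather than values from the paper.

```python
def diffusion_matrix(stoich, rates):
    """Return D = S * diag(f) * S^T as a nested list.

    stoich[i][r] is the net change of species i in reaction r;
    rates[r] is the macroscopic rate of reaction r.
    """
    n_species = len(stoich)
    n_reactions = len(rates)
    return [
        [
            sum(stoich[i][r] * rates[r] * stoich[j][r] for r in range(n_reactions))
            for j in range(n_species)
        ]
        for i in range(n_species)
    ]

# Rows: mRNA, protein. Columns: mRNA production, mRNA removal,
# protein production, protein removal (assumed ordering).
stoich = [
    [1, -1, 0, 0],
    [0, 0, 1, -1],
]

# Hypothetical steady-state rates: production balances removal for each species.
rates = [0.4, 0.4, 2.0, 2.0]

d_matrix = diffusion_matrix(stoich, rates)
print(d_matrix)
```

With this stoichiometry no reaction changes both species at once, so the off-diagonal entries vanish and each diagonal entry is the sum of the production and removal rates of that species.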

In the main text, we show that the hLNA (and hence the heuristic CME) is the correct stochastic description under timescale separation when _{1}=0and _{1}=0 and


Comparison with other stochastic model reduction methods

In this section, we compare the predictions of the ssLNA with the predictions of other stochastic model reduction techniques in the literature. Specifically, we compare with the recent methods of Pahlajani

We consider a simple model of stochastic gene expression given by

which describes transcription, translation and degradation of mRNA and protein. The deterministic REs for this example read

In the common case where the mRNA timescale is very small compared to that of protein, i.e.,

where the parameter $k_s/k_{dM}$ has been interpreted as the burst size (the average number of proteins synthesized per mRNA transcript)

Shahrezaei and Swain

The ssLNA gives the following Langevin equation description of the system

The steady state variance predicted by the above Langevin equation is given by equation (87). The same result has also been previously obtained by Paulsson
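To illustrate the timescale-separation limit for this model, the following Python sketch solves the steady-state Lyapunov equation of the full (non-reduced) LNA entry by entry and evaluates the protein Fano factor. It assumes the usual two-stage scheme (transcription at rate `k0`, mRNA decay `k_dm`, translation `k_s`, protein decay `k_dp`); the name `k0` and all numerical values are hypothetical. Under these assumptions the Fano factor is 1 + b/(1 + k_dP/k_dM) with b = k_s/k_dM, which tends to 1 + b when the mRNA decays much faster than the protein.

```python
def protein_fano_full_lna(k0, k_dm, k_s, k_dp):
    """Protein Fano factor from the Lyapunov equation J C + C J^T + D = 0.

    Molecule-number units, with
      J = [[-k_dm, 0], [k_s, -k_dp]] and D = diag(2*k0, 2*k_s*m_mean).
    """
    m_mean = k0 / k_dm
    p_mean = k_s * m_mean / k_dp
    c_mm = m_mean                                # (1,1) entry: Poissonian mRNA
    c_mp = k_s * c_mm / (k_dm + k_dp)            # (1,2) entry: mRNA-protein covariance
    c_pp = (k_s * c_mp + k_s * m_mean) / k_dp    # (2,2) entry: protein variance
    return c_pp / p_mean

# Illustrative parameters: burst size b = k_s / k_dm = 5.
k0, k_dm, k_s = 10.0, 10.0, 50.0
burst = k_s / k_dm

fano_separated = protein_fano_full_lna(k0, k_dm, k_s, k_dp=0.01)  # fast mRNA
fano_comparable = protein_fano_full_lna(k0, k_dm, k_s, k_dp=10.0)  # no separation
print(fano_separated, fano_comparable, 1.0 + burst)
```

Since the network contains only first-order reactions, the second moments obtained this way coincide with those of the full CME, which makes this model a convenient benchmark for reduced descriptions.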

Recently, another approximate reduction technique based on the LNA has been proposed by Pahlajani, Atzberger and Khammash

where


The authors showed that the application of this formalism to the gene example above leads to a Langevin equation of the form

The variance of fluctuations predicted by the above Langevin equation is given by equation (87) with

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

PT developed the mathematical formulation of the ssLNA and performed the stochastic simulations to corroborate its predictions. AVS contributed to the interpretation of the derivations, in particular to the clarification of issues concerning timescale separation. RG supervised the research, contributed to the derivation of the implicit assumptions of the hLNA and to derivations concerned with the Langevin formulation of the ssLNA, and wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This work was supported by the German Research Foundation (DFG project No. STR 1021/1-2) and by SULSA (Scottish Universities Life Science Alliance), both of which are gratefully acknowledged.