Background
Biochemical systems with small numbers of interacting components have increasingly been studied in recent years, as they are some of the most basic systems in cell biology
1
2
3
. Stochastic effects can strongly influence the dynamics of such systems. Applying deterministic ordinary differential equation (ODE) models to them, which approximate particle numbers as continuous concentrations, can lead to confusing results
4
5
. In some cases, even systems with large populations cannot be accurately modelled by ODEs. For instance, when close to a bifurcation regime, ODE approximations cannot reproduce the behaviour of the system for some parameter values
6
. Stochastic systems can be modelled using discrete Markov processes. The density of states of a wellstirred stochastic chemical reaction system at each point in time is given by the chemical master equation (CME)
7
8
. The stochastic simulation algorithm (SSA)
9
is an exact method for simulating trajectories of the CME as the system evolves in time.
The SSA can be computationally intensive to run for realistic problems, and alternative methods such as the τleap have been developed to improve performance
10
. Instead of simulating each reaction, the τleap performs several reactions at once, thus ‘leaping’ along the history axis of the system. This means that, unlike the SSA, the τleap is not exact; accuracy is maintained by not allowing too many reactions to occur per step. The size of each timestep, τ, determines the number of reactions occurring during that step, given by a Poisson random number.
This gain in speed must be balanced with loss of accuracy: larger steps mean fewer calculations but reduced accuracy. Many common τleap implementations employ a variable stepsize, as using the optimal stepsize τ at each point is crucial for the accuracy of the method
10
11
12
. However a fixedstep implementation can be useful in some cases. Although it may be less efficient, it is much easier to implement than variablestep equivalents. More importantly, the extrapolation framework that we describe in this paper requires a fixedstep method.
The original τleap as described by Gillespie
10
is known as the Euler τleap, as it can be compared to the Euler method for solving ODEs. It has been shown to have weak order of convergence one under both the scaling
τ
→
0
(traditional scaling)
13
14
and
V
→
∞
(large volume scaling)
15
, where V is the volume of the system. In the same paper, Gillespie also proposed the midpoint τleap method
10
, which has higherorder convergence in some cases
15
16
. Tian and Burrage
17
proposed a variant known as the Binomial τleap method that avoids issues with chemical species becoming negative. Only recently has more work been done on constructing higherorder stochastic methods. One such method is the randomcorrected τleap
18
, where at each timestep a random correction is added to the Poisson random number that determines the number of reactions in that step. Given a suitable random correction, the lowest order errors on the moments can be cancelled. In this way methods with up to weak order two convergence for both mean and covariance have been constructed. More recently, Anderson and Koyama
19
and Hu et al.
20
proposed another weak secondorder method, the θtrapezoidal τleap, which is an adaptation of the stochastic differential equation (SDE) solver of Anderson and Mattingly
21
for the discrete stochastic case.
In this paper we introduce a framework for improving the order of accuracy of existing fixedstep stochastic methods by using them in conjunction with Richardson extrapolation, a wellknown technique for improving the order of accuracy of a numerical solver by combining sets of simulations with different stepsizes
22
. Extrapolation was originally developed for ODE solvers but has also been successfully applied to SDE methods
23
. Our approach has three main advantages:
(1) It increases the order of accuracy of the methods supplied to it. This is desirable for the obvious reason that the resulting solutions are more accurate, as well as that larger timesteps can be used to reach a certain level of accuracy, reducing the computational cost. This is discussed further in our Conclusions.
(2) It can be applied to any fixedstep solver, for instance inherently higherorder methods such as the θtrapezoidal τleap or methods with an extended stability region such as stochastic RungeKutta methods [24].
(3) The resulting higherorder solutions can be extrapolated again to give solutions with even higher order, as there is no (theoretical) limit on the number of times a method can be extrapolated (although statistical errors can obscure the results if the method is too accurate  see Section Monte Carlo error).
Our extrapolated methods may be useful for researchers in biology and biochemistry, as they are easy to implement and can accurately and quickly simulate discrete stochastic systems that could otherwise be too computationally intensive.
We show how the extrapolation framework can be applied to fixedstep stochastic algorithms using the examples of the fixedstep Euler τleap, midpoint τleap
10
and θtrapezoidal τleap
20
methods. The extrapolation procedure depends heavily on the the existence of an appropriate global error expansion for the weak error of the numerical method. Once this is known, extrapolation consists of simple arithmetic. We calculate such an expansion for an arbitrary weak firstorder method; this allows us to use extrapolation in order to obtain higherorder solutions. The weak order of all the moments of such methods can be improved by extrapolation. To reinforce this, we perform a simple error analysis by comparing the equations for the true and numerical mean of the Euler τleap method; we see that its global error is order one, and extrapolating it increases the order to two for the case of zerothorder and firstorder reactions. Using numerical simulations, we demonstrate that this is true for two firstorder and three higherorder test systems with the Euler, midpoint and θtrapezoidal τleap methods. Moreover, the extrapolated methods have consistently lower errors, and in many cases visibly higherorder convergence in the first two moments (the lack of convergence in some of the simulations is discussed in Section Monte Carlo error). Finally, we demonstrate that the extrapolation framework can be used to give even higherorder numerical solutions by applying a second extrapolation to the Euler τleap method.
The rest of this paper is organized as follows. We begin with an overview of the SSA and the τleap methods we will use later. We then discuss Richardson extrapolation for ODEs and SDEs and introduce the extrapolated discrete stochastic framework. We give numerical results to support our claims that extrapolation reduces the error of fixedstep methods. Finally, we discuss the Monte Carlo error and give our conclusions. The derivations of the global error expansions for SDEs and discrete stochastic methods and related material are presented in the Appendix.
Overview of stochastic simulation methods
SSA
Gillespie’s SSA
9
is a statistically exact method for simulating paths from a Markov jump process. The two basic assumptions of the SSA are (i) that individual molecules are not explicitly tracked, and (ii) there are frequent nonreactive collisions. Thus we assume that the system is wellmixed and homogeneous.
The SSA simulates a system of biochemical reactions with N species and M reactions, interacting inside a fixed volume V at constant temperature. The populations of chemical species (as molecular numbers, not concentrations) at time t are represented as a state vector
x
≡
X
(
t
)
≡
(
X
1
,
…
,
X
N
)
T
. Reactions are represented by a stoichiometric matrix
ν
j
≡
(
ν
1
j
,
…
,
ν
Nj
)
T
, where j = 1,…,M, composed of M individual stoichiometric vectors. Each stoichiometric vector represents a reaction j occurring and the system changing from state x to x + _{
ν
j
}. Each reaction occurs in an interval t
t + τ) with relative probability a
_{
j
}(x)dt, where a
_{
j
} is the propensity function of the jth reaction. Propensity functions are given by the massaction kinetics of the reactant chemical species. For more detail, the reader is referred to Ref.
9
. The variables X,
ν
_{
j
} and a
_{
j
}(X) fully characterise the system at each point in time.
Algorithm 1. SSA Direct Method With the system in state X
_{
n
} at time t
_{
n
}
:
1. Generate two random numbers r1and r2from the unitinterval uniform distribution
U
(
0
,
1
)
.
2. Find the time until the next reaction
τ
=
1
a
0
ln
1
r
1
, where
a
0
(
X
n
)
=
∑
j
=
1
M
a
j
(
X
n
)
.
3. Find next reaction j from
∑
m
=
1
j
−
1
a
m
(
X
n
)
<
a
0
r
2
≤
∑
m
=
j
M
a
m
(
X
n
)
.
4. Update t
_{
n + 1} = t
_{
n
} + τ and X
_{
n+ 1} = X
_{
n
} +
ν
_{
j
}.
The Direct Method requires two newlygenerated random numbers at each timestep. Although there are other SSA implementations, such as the Next Reaction Method
25
and the Optimised Direct Method
26
, which can be more economical, in general the SSA is computationally costly.
τleap method
The τleap algorithm leaps along the history axis of the SSA by evaluating groups of reactions at once
10
. This means significantly fewer calculations, i.e. shorter computational time, per simulation, but simulation accuracy is compromised: we do not know exactly how many reactions occurred during each time step, nor can we tell more precisely when each reaction occurs than in which timestep. The leap condition defines an upper bound for the size of each timestep τ: it must be so small that the propensities do not change significantly for its duration, i.e. the change in state from time t to t + τ is very small
10
. Since τ is small, the probability a(x)τ that a reaction occurs during t
t + τ) is also small, so the number of times K
_{
j
}each reaction fires over one timestep can be approximated by
P
(
a
j
(
x
)
τ
)
, a Poisson random variable with mean and variance a
_{
j
}(x)τ
10
. The Euler τleap algorithm is the basic τleap method, and corresponds to the Euler method for solving ODEs or the EulerMaruyama method for solving SDEs.
Algorithm 2. Euler τleap method With the system in state X
_{
n
}
at time
t
_{
n
}
, and a timestep τ:
1. Generate M Poisson random numbers
k
j
=
P
(
a
j
(
X
n
)
τ
)
.
2. Update t
_{
n + 1} = t
_{
n
} + τ and
X
n
+
1
=
X
n
+
∑
j
=
1
M
ν
j
k
j
.
The Euler τleap has weak order one
13
14
15
. Although considerable work has been done on improving the mechanism for selecting the timesteps τ
10
11
12
and eliminating steps that would result in negative populations
17
27
28
29
, this does not affect the order of the method, limiting its accuracy. Methods with higher order are the only way to improve the accuracy beyond a certain point. Realising this, Gillespie also proposed a higherorder τleap method, the midpoint τleap
10
. This is similar to the midpoint method for ODEs, where at each step an estimate is made of the gradient of X at t
_{
n
} + τ/2. X
_{
n
} is then incremented using this extra information to give a more accurate approximation.
Algorithm 3. Midpoint τleap method With the system in state X
_{
n
}
at time
t
_{
n
}
, and a timestep
τ:
1. Calculate
X
′
=
X
n
+
1
2
τ
∑
j
=
1
M
ν
j
a
j
(
X
n
)
.
2. Generate M Poisson random numbers
k
j
=
P
(
a
j
(
X
′
)
τ
)
.
3. Update t
_{
n + 1} = t
_{
n
} + τ and
X
n
+
1
=
X
n
+
∑
j
=
1
M
ν
j
k
j
.
Although under the scaling
τ
→
0
the midpoint τleap has the same order of accuracy in the mean as the Euler τleap method, under the large volume scaling it has weak order two
15
16
. Our numerical simulations also suggest that it gives higherorder approximations to the first two moments for both linear and nonlinear systems (although this is not clear from the literature). However the local truncation error of its covariance is firstorder
16
.
θtrapezoidal τleap method
Based on the SDE method of Anderson and Mattingly
21
, the θtrapezoidal τleap
20
is a weak secondorder method. It consists of two steps, a predictor step with size θτ and a corrector step with size (1 − θ)τ that aims to cancel any errors made in the first step.
Algorithm 4. θtrapezoidal τleap method For a specified θ ∈ (0,1),
α
1
=
1
2
(
1
−
θ
)
θ
,
α
2
=
(
1
−
θ
)
2
+
θ
2
2
(
1
−
θ
)
θ
. With the system in state X
_{
n
}
at time
t
_{
n
}
, and a timestep
τ:
1. Generate M Poisson random numbers
k
j
′
=
P
(
a
j
(
X
n
)
θτ
)
,
j
=
1
,
…
,
M
.
2. Calculate predictor step
X
′
=
X
n
+
∑
j
=
1
M
ν
j
k
j
′
.
3. Calculate
l
j
=
max
α
1
a
j
(
X
′
)
−
α
2
a
j
(
X
n
)
,
0
.
4. Generate M Poisson random numbers
k
j
=
P
(
l
j
(
1
−
θ
)
τ
)
,
j
=
1
,
…
,
M
.
5. Update t
_{
n + 1} = t
_{
n
} + τ and
X
n
+
1
=
X
′
+
∑
j
=
1
M
ν
j
k
j
.
Specifically, the θtrapezoidal τleap method was shown to have weak order of convergence two in the moments, and a local truncation error of
O
(
τ
3
V
−
1
)
for the covariance. τ = V
^{−β
}, 0 < β < 1 and in the analysis
V
→
∞
, but in simulations the system volume is kept constant; thus it seems that in practice this also results in weak secondorder convergence in the covariance
20
.
Methods
The extrapolation framework
We start with an ODE solver with stepsize h approximating the true solution x(T) by
x
n
h
(where T = nh), for which we assume the following global error expansion exists:
x
(
T
)
−
x
n
h
=
e
k
(
T
)
h
k
+
e
k
+
1
(
T
)
h
k
+
1
+
e
k
+
2
(
T
)
h
k
+
2
+
…
,
where k is the order of the numerical method and the e
_{
k
}(T) are constants that only depend on the final integration time T. Extrapolating has the effect of cancelling the leading error term, resulting in a more accurate approximation. The existence of such an expansion is key to constructing a higherorder approximation, as the appropriate extrapolation coefficients must be used for the leading error terms to cancel. For example, in the case of a firstorder method with stepsize h,
x
(
T
)
−
x
n
h
=
e
1
(
T
)
h
+
e
2
(
T
)
h
2
+
O
(
h
3
)
,
and similarly for stepsize
h
2
,
x
(
T
)
−
x
2
n
h
/
2
=
e
1
(
T
)
h
2
+
e
2
(
T
)
h
2
4
+
O
(
h
3
)
.
Setting
x
̂
n
h
=
2
x
2
n
h
/
2
−
x
n
h
and using (3) and (2), we obtain
x
(
T
)
−
x
̂
n
h
=
−
h
2
2
e
2
(
t
)
+
O
(
h
3
)
,
which implies that
x
̂
n
h
is now a secondorder approximation to x(T).
We define
z
(
k
,
q
)
=
x
n
h
q
to be a series of approximations with order k and stepsize h
_{
q
}to the true solution x(T), where T = n
h
_{1}, h
_{
q
} = T / p
_{
q
} and h
_{1} > h
_{2} > … > h
_{
q
}. In general, one can use an order k method with stepsizes h
q
_{−1} and h
_{
q
} (in the previous example, h
q
_{−1} = hand
h
q
=
h
2
, i.e. p = 2), to arrive at an order k + 1 estimate to x(T),
x
(
T
)
−
z
(
k
,
h
q
−
1
)
=
O
(
h
k
+
1
)
,
where
z
(
k
,
h
q
−
1
)
=
p
k
z
(
k
,
q
−
1
)
−
z
(
k
,
q
)
p
k
−
1
.
This process can be repeated indefinitely. We can extrapolate from the initial approximations z(1,1),…,z(1,q) by combining the successive solutions in each column of the Romberg table:
z
(
1
,
1
)
z
(
2
,
1
)
z
(
1
,
2
)
z
(
3
,
1
)
z
(
2
,
2
)
⋮
z
(
q
,
q
)
z
(
1
,
3
)
⋮
z
(
3
,
q
)
⋮
z
(
2
,
q
)
z
(
1
,
q
)
For instance, in Eq. (4) we used (with p = 2)
x
n
h
=
z
(
1
,
1
)
and
x
2
n
h
/
2
=
z
(
1
,
2
)
to find
x
̂
n
h
=
z
(
2
,
1
)
. Repeating with
x
2
n
h
/
2
and
x
2
n
h
/
4
, we could extrapolate to find
x
̂
2
n
h
/
2
=
z
(
2
,
2
)
. Then we could extrapolate
x
̂
n
h
and
x
̂
2
n
h
/
2
to find a thirdorder approximation
x
̂
̂
n
h
=
z
(
3
,
1
)
, and so on.
Stochasticas

E
(
f
(
x
(
T
)
)
)
−
E
(
f
(
x
n
h
)
)

,
where
f
:
R
N
↦
R
is a suitable smooth functional, for example the first moment of one of the components of x.
x
n
h
is a numerical approximation to the SDE
d
X
t
=
a
(
X
t
,
t
)
dt
+
b
(
X
t
,
t
)
d
W
t
,
where
a
(
x
,
t
)
:
R
N
+
1
↦
R
N
b
(
x
,
t
)
:
R
N
+
1
×
M
↦
R
N
×
M
and W
_{
t
} is a standard Mdimensional Wiener increment. Talay and Tubaro
23
derived a similar expansion to Eq. (1) for the global error when
x
n
h
was calculated using the EulerMaruyama and Milstein schemes (outlined in Appendix A). By using this expansion and the extrapolation framework, they were able to derive a secondorder approximation to
E
(
f
(
x
(
t
)
)
)
. The crucial step in obtaining the global error expansion was to express it as a telescopic sum of the local errors. Liu and Li
30
also followed a similar procedure to derive a global error expansion for numerical methods for SDEs with Poisson jumps, thus allowing them to obtain higherorder weak approximations.
Extrapolation for discrete chemical kinetics
The extrapolation framework can be extended to the discrete stochastic regime. Since it requires two or more sets of approximations with given stepsizes (e.g. h and h/2), it can only be used with a fixedstep method: as more complex τleap methods vary τ at each step, it is not clear how to extrapolate them. However, this has the advantage of making our method very easy to program, as there is no complex programming overhead, for instance in choosing the timestep for τleap methods. We stress that we mostly use extrapolation to obtain higherorder approximations to the moments of the system (or their combinations, such as the covariance). In principle, given enough of the moments, the full probability distribution at some given time could be constructed. This is known as the Hamburger moment problem
31
and in general is a difficult problem to solve, as it might admit an infinite number of solutions. However, in some cases it is possible to reconstruct the full distribution from the extrapolated moments, as we have a priori knowledge about its shape. For instance, when the final distribution of states is known to be binomial, only the mean and variance are necessary for constructing the full extrapolated distribution (see Numerical Results, System 1).
In this section we focus on the Euler τleap method (ETL), since this choice simplifies the analysis, but we show in Appendix B that any fixedstepsize method with known weak order can be extrapolated. In our numerical investigations we show results for the ETL, the midpoint τleap (MPTL) and the θtrapezoidal τleap (TTTL) method. Extrapolating the ETL is very similar to extrapolating an ODE solver. The extrapolated ETL, which we call xETL from here on, involves running two sets of S ETL simulations for time T = nτ.
Algorithm 5. Extrapolated Euler τleap method (xETL)
1. Run S ETL simulations with stepsize τ, to get
s
x
n
τ
,
s
=
1
,
…
,
S
.
2. Calculate desired moments
E
S
(
f
(
x
n
τ
)
)
=
1
S
∑
s
=
1
S
f
(
s
x
n
τ
)
.
3. Repeat steps 1 and 2 using stepsize
E
S
(
f
(
x
2
n
τ
/
2
)
)
.
4. Take
2
E
S
(
f
(
x
2
n
τ
/
2
)
)
−
E
S
(
f
(
x
n
τ
)
)
as the extrapolated approximation to the desired moment.
Algorithm 5 can be easily modified for use with any other fixedstep method, by replacing the ETL in Step 1 with the chosen method.
It is instructive to use a simple example to see analytically the effects of extrapolating the ETL. When the propensity functions are linear (i.e. the system only contains zerothorder and firstorder reactions), the equations for the moments are closed
32
33
and we can find explicitly the global error expansion for the first moment of our numerical solution (i.e. choose f(x) = x). The propensity functions can be written as
[
a
1
(
x
)
,
a
2
(
x
)
,
…
,
a
M
(
x
)
]
T
=
C
x
+
d
,
where
C
∈
R
M
×
N
,
ν
∈
R
N
×
M
and
d
∈
R
M
, and we define W = νC, i.e.
W
∈
R
N
×
N
. Thus at some timestep m
mτ < T, the ETL gives (using matrix notation)
x
m
+
1
τ
=
x
m
τ
+
ν
P
(
τ
(
C
x
m
τ
+
d
)
)
,
where
x
m
τ
is an approximation to the true state vector x(t) at the mth timestep (i.e. time t) with stepsize τ. Taking the expectation of both sides, the ETL evolves the mean as
E
(
x
m
+
1
τ
)
=
E
(
x
m
τ
)
+
ν
E
(
P
(
τ
(
C
x
m
τ
+
d
)
)
)
=
E
(
x
m
τ
)
+
ν
E
(
E
(
P
(
τ
(
C
x
m
τ
+
d
)
)

x
m
τ
)
)
=
(
I
+
τW
)
E
(
x
m
τ
)
+
τν
d
,
where I is the N × N identity matrix. Note that we cannot evaluate the expectation of
P
(
τ
(
C
x
m
τ
+
d
)
)
directly: because
x
m
τ
is a random variable, we do not know the distribution of
P
(
x
m
τ
)
. Using the law of total expectation we can condition on
x
m
τ
taking a specific value;
P
(
x
m
τ
)

x
m
τ
does have a Poisson distribution. From (8), we see that at time T = nτ
E
(
x
n
τ
)
=
I
+
τnW
+
n
2
τ
2
W
2
+
…
μ
(
0
)
+
W
−
1
(
τnW
+
1
2
τ
2
n
2
W
2
+
…
)
ν
d
.
The probability density function of the SSA at time t is given by the CME
7
8
. The mean
μ
(
t
)
=
E
(
x
(
t
)
)
can be found from the CME
34
; it evolves as
d
μ
(
t
)
dt
=
W
μ
(
t
)
+
ν
d
.
The solution of this is
μ
(
t
)
=
e
Wt
μ
(
0
)
+
e
Wt
∫
0
t
e
−
Ws
ν
d
ds.
Using a Taylor expansion and basic manipulation, at t = Tthis evaluates to
μ
(
T
)
=
I
+
TW
+
1
2
T
2
W
2
+
…
μ
(
0
)
+
W
−
1
TW
+
1
2
T
2
W
2
+
…
ν
d
.
Taking (11) minus (9) we see that the global error is
μ
(
t
)
−
E
(
x
n
τ
)
=
1
2
τT
W
2
+
O
(
τ
2
)
μ
(
0
)
+
1
2
τTW
+
O
(
τ
2
)
ν
d
.
Furthermore the extrapolated error is
μ
(
T
)
−
2
E
(
x
2
n
τ
/
2
)
−
E
(
x
n
τ
)
=
1
6
τ
2
T
W
3
+
O
(
τ
3
)
μ
(
0
)
+
1
6
τ
2
T
W
2
+
O
3
ν
d
,
so the leading error term has been cancelled, leaving an order two approximation. Such a calculation would also apply for the MPTL. The difference is that for linear systems the MPTL is secondorder convergent with respect to the mean
16
, and similarly for the TTTL
20
. This should be taken into account in order to choose the correct extrapolation coefficients.
The above analysis only applies for the mean of a linear system, a very restricted case, but it is useful for demonstrating the basic principles of stochastic extrapolation. We employ a similar approach to Talay and Tubaro
23
and Liu and Li
30
to find a general expression for the global error expansion of the moments of a weak firstorder discrete stochastic method; this is Appendix B. In Appendix C we explicitly evaluate this for the particle decay system and show that it is equivalent to Eq. 12 in this case. Appendix D contains the equations for the second moment for the case of linear systems.
Multimodal systems
As discussed before, one limitation of our approach is that only specific characteristics of the particle distribution can be extrapolated, rather than the full distribution. Typically we choose these to be the first and second moments, as for many systems these are the quantities of interest. However, in some cases the moments do not take values relevant to the actual dynamics of the system
35
36
. This occurs, for instance, in bimodal or multimodal systems, which have two or more stable states. Nevertheless, our method can be easily generalised to accommodate multimodal distributions as follows.
Algorithm 6. Extrapolated Euler τleap method (xETL) for multimodal systems
1. Run S ETL simulations with stepsize τ, to get
s
x
n
τ
,
s
=
1
,
…
,
S
.
2. Plot histograms of the particle populations at time T and identify the stable states.
3. Choose point(s) at which to partition the S simulations into p subsets of S
_{1},…,S
_{p} simulations clustered around each stable state.
4. Calculate desired moments over the subsets of simulations,
E
S
i
(
f
(
x
n
τ
)
)
=
1
S
i
∑
s
=
1
S
i
f
(
s
x
n
τ
)
,
i
=
1
,
…
,
p
.
5. Repeat steps 1 and 4 using stepsize τ/2 to get
E
S
i
(
f
(
x
2
n
τ
/
2
)
)
,
i
=
1
,
…
,
p
.
6. Take
2
E
S
i
(
f
(
x
2
n
τ
/
2
)
)
−
E
S
i
(
f
(
x
n
τ
)
)
,
i
=
1
,
…
,
p
as the extrapolated approximation to the desired moment for each of the p subsets of simulations.
Algorithm 6 is also simple to code and does not require significant extra computational time compared to Algorithm 5 because the dynamics of the system are found from the original simulations that are necessary for the extrapolation anyway. The point(s) at which the simulations are split into subsets can affect the accuracy of the results, so must be chosen with some care. In the Numerical Results (System 5), we apply Algorithm 6 to a bimodal system, and investigate the effects of the choice of splitting point.
Results and discussion
Numerical results
We simulate some example systems for various stepsizes τ over time t = [0,T] using three fixedstep numerical methods: the Euler τleap (ETL), midpoint τleap (MPTL) and θtrapezoidal τleap (TTTL) methods (with θ = 0.55), and their extrapolated versions, the xETL, xMPTL and xTTTL. We plot the absolute weak errors in the mean and second moment, i.e.

E
(
x
(
T
)
)
−
E
S
(
x
n
τ
)

,

E
(
x
(
T
)
x
T
(
t
)
)
−
E
S
(
x
n
τ
(
x
n
τ
)
T
)

for the ETL, MPTL and TTTL methods and

E
(
x
(
T
)
)
−
2
E
S
(
x
2
n
τ
/
2
)
+
E
S
(
x
n
τ
)

,

E
(
x
(
T
)
x
T
(
t
)
)
−
2
E
S
(
x
2
n
τ
/
2
(
x
2
n
τ
/
2
)
T
)
+
E
S
(
x
n
τ
(
x
n
τ
)
T
)

for the extrapolated methods. Here x(T) is the analytical solution at time T and
E
S
(
f
(
x
n
τ
)
)
are the moments of its approximations given by S simulations of a fixedstep method with stepsize τ run for n steps. For the linear systems, the true solution is calculated analytically; for the nonlinear systems we use the value given by 10^{6} or 10^{7} repeats of the SSA (depending on the system). The error of a weak order α method with stepsize τ is approximately C
τ
^{
α
}, where C is an unknown constant. To easily see the order of accuracy of our results, we plot all the errors on loglog plots. Gradients are calculated using a least squares fit through the points. The highest level of Monte Carlo error, which can be calculated for the linear systems, is marked on the appropriate plots as a straight black line. Below this level, the absolute error results are, at least in part, essentially random (see Section Monte Carlo error). We note that in all test systems, the timesteps used were all in the useful τleaping regime: Poisson counts for each reaction channel varied between tens to hundreds.
System 1: Particle decay system
A simple test system is a particle decay,
X
→
k
∅
,
k
=
0
.
1
.
The initial particle number was X(0) = 10^{4} and the simulation time was T = 10.4. The units here, and in the systems below, are nondimensional. This system is useful only as a test problem, but is firstorder and easily tractable. The analytical mean and second moment are
E
(
X
(
t
)
)
=
X
(
0
)
e
−
kT
,
E
(
X
(
T
)
2
)
=
X
(
0
)
e
−
kT
−
X
(
0
)
e
−
2
kT
+
(
X
(
0
)
)
2
e
−
2
kT
.
The average final particle numbers, calculated as above, were
E
(
X
(
10
.
4
)
)
=
3534
.
5
. We ran 10^{8} simulations using timesteps τ = 0.05,…,0.8. The errors on the mean and second moment are shown in Figure
1. In both cases, the ETL gives firstorder errors and the xETL gives approximately secondorder errors. The MPTL and TTTL also converge with second order. The errors of the xMPTL and xTTTL are very small, although they do not converge with any noticeable order. This is because the values of the absolute error for these methods are effectively given by their Monte Carlo error, rather than the bias (this is a recurring theme in stochastic simulations  see Section Monte Carlo error). The maximum level of Monte Carlo error was 0.0148 for the mean and 104.8 for the second moment.
<p>Figure 1</p>System 1 absolute errors
System 1 absolute errors. Absolute error (13) in (a) the mean for the ETL (gradient 1.0) and xETL (gradient 1.9), MPTL (gradient 1.9) and xMPTL (gradient 0.5), TTTL (gradient 2.0) and xTTTL (gradient 1.5); (b) the second moment for ETL (gradient 1.0) and xETL (gradient 1.6), MPTL (gradient 1.9) and xMPTL (gradient 0.5), TTTL (gradient 2.0) and xTTTL (gradient 1.5). The maximum Monte Carlo error levels (black straight lines) were 0.0148 (mean) and 104.8 (second moment). Results from 10^{8} simulations, each run for T = 10.4.
In addition, because the final distribution of this system is known to be binomial
10
, we can construct the distribution of the extrapolated solutions from just the mean and variance (Figure
2). The dashed lines are the distributions of ETL simulations with τ = 0.05 (blue) and τ = 0.8 (red), calculated from their histograms, and the circles are the full distributions of the xETL using τ = 0.05 (blue) and τ = 0.8 (red). The solution of the CME (black line) is the true distribution. Both the extrapolated solutions match the CME solution very well, in both mean and overall shape.
<p>Figure 2</p>Constructing full extrapolated distributions of System 1
Constructing full extrapolated distributions of System 1. Distribution of states at time T = 10.4 for ETL with τ = 0.05 (blue dashes), ETL with τ = 0.8 (red dashes), xETL with τ = 0.05 (red circles), xETL with τ = 0.8 (blue circles). The analytical solution is given by the CME (black line). The extrapolated distributions match the CME solution very closely.
System 2: Chain decay system
This system consists of five chemical species, each undergoing a firstorder reaction. It forms a closed chain of linear reactions, and is intended as a more complicated, but still linear, example system.
X
1
→
k
1
X
2
→
k
2
X
3
→
k
3
X
4
→
k
4
X
5
→
k
5
X
1
,
k
j
=
0
.
3
,
j
=
1
,
…
,
5
.
The initial populations were X(0) = (2500,1875,1875,1875,1875^{)T
}and simulation time was T = 16. Since this system is linear, its expectation and covariance can be calculated analytically, as shown in Jahnke and Huisinga
32
; we used these to calculate the true second moment. The average final particle numbers, given by the analytical mean, were
E
(
X
(
16
)
)
=
(
1971
.
3
,
1996
.
7
,
2025
.
1
,
2020
.
9
,
1986
.
1
)
T
. We ran 10^{8} simulations with τ = 0.1,…,1.6. Figure
3 shows the absolute errors for the mean and second moment of X
_{1}. The ETL is again approximately weak order one and the xETL weak order two. Again, the errors for the MPTL, TTTL, xMPTL and xTTTL are very low, although again their order of convergence is not quite as high as expected. We believe this is due to the unusually high accuracy of these methods for closed systems compared to a maximum Monte Carlo error level of 0.0132 (mean) and 52.8 (second moment), which are high relative to the bias of these methods.
<p>Figure 3</p>System 2 absolute errors
System 2 absolute errors. Absolute error (13) of X_{1} in (a) the mean for ETL (gradient 1.3) and xETL (gradient 2.4), MPTL (gradient 1.2) and xMPTL (gradient 1.6), TTTL (gradient 1.1) and xTTTL (gradient 1.9); (b) the second moment for ETL (gradient 1.3) and xETL (gradient 2.4), MPTL (gradient 1.6) and xMPTL (gradient 1.7), TTTL (gradient 1.0) and xTTTL (gradient 2.0). The maximum Monte Carlo error levels (black straight lines) were 0.0132 (mean) and 52.8 (second moment). Results from 10^{8} simulations, each run for T = 16.
System 3: MichaelisMenten system
The MichaelisMenten is a common nonlinear test system, and represents an enzyme (X
_{2}) reacting with a substrate (X
_{1}) to make a product (X
_{4}). The enzyme and substrate form a complex (X
_{3}), which can either dissociate or undergo a reaction to a product plus the original enzyme. It has four chemical species in three reactions:
X
1
+
X
2
→
k
1
X
3
,
k
1
=
1
0
−
5
,
X
3
→
k
2
X
1
+
X
2
,
k
2
=
0
.
2
,
X
3
→
k
3
X
2
+
X
4
,
k
3
=
0
.
2
.
Simulation time was T = 16 and the initial populations were X(0) = (10^{4},2 × 10^{3},2 × 10^{4},0)^{
T
}. We used 10^{8}simulations with τ = 0.1,…,1.6. There is no analytical solution, so in this case we approximated it with 10^{7}SSA simulations. The average final state, given by the SSA, was
E
(
X
(
16
)
)
=
(
5927
.
0
,
18716
.
2
,
3283
.
8
,
20789
.
2
)
T
. The errors in the mean and second moment for X
_{1} are shown in Figure
4. The ETL converges with order one for 10^{8} simulations, and the xETL with order two. The MPTL and TTTL have a similar accuracy to the xETL, with approximate order two, with the xMPTL and xTTTL of approximate order three.
<p>Figure 4</p>System 3 absolute errors
System 3 absolute errors. Absolute error (13) of X_{1} in (a) the mean for ETL (gradient 1.0) and xETL (gradient 2.0), MPTL (gradient 1.6) and xMPTL (gradient 3.3), TTTL (gradient 1.6) and xTTTL (gradient 1.2); (b) the second moment for ETL (gradient 1.0) and xETL (gradient 2.0), MPTL (gradient 1.6) and xMPTL (gradient 3.3), TTTL (gradient 1.6) and xTTTL (gradient 1.2). Results from 10^{8} simulations, each run for T = 16.
We investigated the effects of the coefficient of variation (CV, standard deviation divided by mean) on this system. The CVs of each species, averaged across all τ, in System 3 at T = 16 were CV(X(16)) = (0.01,0.003,0.02,0.004)^{
T
}. In general, a higher CV indicates that the system is more noisy. We chose a new set of parameters for System 3 to give higher CVs:
X
new
(
0
)
=
(
100
,
50
,
200
,
0
)
T
,
k
new
=
(
1
0
−
4
,
0
.
05
,
0
.
07
)
T
. The CVs using these parameters were
CV
(
X
new
(
16
)
)
=
(
0
.
05
,
0
.
03
,
0
.
1
,
0
.
07
)
T
, very different from the original CVs. However the relative errors (absolute error divided by average SSA state) at τ = 0.1 were very similar:
error
rel
(
X
(
16
)
)
=
(
0
.
004
,
0
.
0005
,
0
.
003
,
0
.
001
)
T
for the original system and
error
rel
(
X
new
(
16
)
)
=
(
0
.
001
,
0
.
0002
,
0
.
0007
,
0
.
002
)
T
with the new parameters (note that it is not useful to average the errors across all τ). This shows that higher CV does not necessarily mean higher errors, and the two are indicators of different characteristics of the system. We have focused on the errors as this is the characteristic that we want to improve.
System 4: Twoenzyme mutual inhibition system
This is a more realistic system involving 8 chemical species and 12 reactions
37
. It represents two enzymes, E
_{
A
} and E
_{
B
}, which catalyse the production of compounds A and B, respectively. In a classic example of doublenegative feedback, each product inhibits the activity of the other enzyme. For this reason, the system is bistable in A and B: when there are many particles of A, few particles of B can be produced, and vice versa. The reactions are
E
A
→
k
1
E
A
+
A
,
k
1
=
15
,
E
B
→
k
2
E
B
+
B
,
k
2
=
15
,
E
A
+
B
⇌
k
4
k
3
E
A
B
,
k
3
=
5
×
1
0
−
4
,
k
4
=
2
,
E
A
B
+
B
⇌
k
6
k
5
E
A
B
2
,
k
5
=
1
0
−
3
,
k
6
=
6
,
A
→
k
7
∅
k
7
=
5
,
E
B
+
A
⇌
k
9
k
8
E
B
A
,
k
8
=
5
×
1
0
−
4
,
k
9
=
2
,
E
B
A
+
A
⇌
k
11
k
10
E
B
A
2
,
k
10
=
1
0
−
3
,
k
11
=
6
,
B
→
k
12
∅
,
k
12
=
5
.
We simulated this system for T = 3.2 using initial populations of
X
(
0
)
=
(
2
×
1
0
4
,
1
.
5
×
1
0
4
,
9500
,
9500
,
2000
,
500
,
2000
,
500
)
T
,
where
X
=
(
A
,
B
,
E
A
,
E
B
,
E
A
B
,
E
A
B
2
,
E
B
A
,
E
B
A
2
)
T
. We ran 10^{7} simulations of the τleap using τ = 0.005,…,0.08. 10^{6} SSA simulations were used as an approximation to the analytical values. The final state of the system as given by the SSA mean was
E
(
X
(
3
.
2
)
)
=
(
10420
.
5
,
4884.4,3594.7,1528.0,4592.7,3812.6,3853.8,6618.2^{)T
}. The errors for X
_{1} in this system are shown in Figure
5. The ETL and xETL gave errors of approximately order one and two, respectively. The MPTL and TTTL were again approximate order two; the xMPTL had errors very similar to those of the xETL, and the xTTTL had very low errors of approximate order three.
<p>Figure 5</p>System 4 absolute errors
System 4 absolute errors. Absolute error (13) of X_{1} in (a) the mean for ETL (gradient 1.3) and xETL (gradient 2.6), MPTL (gradient 1.9) and xMPTL (gradient 2.3), TTTL (gradient 1.7) and xTTTL (gradient 2.7); (b) the second moment for ETL (gradient 1.3) and xETL (gradient 2.6), MPTL (gradient 2.0) and xMPTL (gradient 2.3), TTTL (gradient 1.7) and xTTTL (gradient 2.9). Results from 10^{7} simulations, each run for T = 3.2.
System 5: Schlögl system
The last example we give illustrates how our method could work for systems that have multimodal distributions. The Schlögl system consists of four reactions
38
and is bimodal:
A
+
2
X
→
k
1
3
X
,
k
1
=
3
×
1
0
−
7
,
3
X
→
k
2
A
+
2
X
,
k
2
=
1
0
−
4
,
B
→
k
3
X
,
k
3
=
1
0
−
3
,
X
→
k
4
B
,
k
4
=
3
.
5
,
where the populations of species A and B are held constant at 10^{5} and 2 × 10^{5} respectively, so only the numbers of X can change. This is a common benchmark system for computational simulation algorithms. Under certain parameter configurations, this system has two stable states for X, one high and one low. When X(0) is low, the system usually settles in the lower equilibrium, and vice versa for high X(0). We used X(0) = 250, an intermediate value that gives a bimodal distribution. We ran 10^{8} simulations until T = 5 using τ = 0.0125,…,0.2. Because of the bimodality, we separated the data into two sets. As the Schlögl system is simple enough to be solved using a CME solver (see e.g.
39
), we could calculate the density of states of X at T = 5. When simulating more complicated systems, this can be approximated by a histogram of the distribution of species at T (from the initial ETL simulations, for instance) to get a reasonable idea of its shape. Average final particle number was calculated from the CME to be
E
(
X
(
5
)
)
=
101
.
3
for the low peak and
E
(
X
(
5
)
)
=
546
.
2
for the high peak. Figures
6 and
7 show the absolute errors for this system (low and high peaks, respectively). The error levels seem to be similar in both cases. The general trend was for the ETL, MPTL and TTTL to converge with approximate order one, whereas the extrapolated solutions did not have a clear order of convergence. It is clear that in this case also, the Monte Carlo error is interfering with our ability to see a clear order of convergence.
<p>Figure 6</p>System 5 low peak absolute errors
System 5 low peak absolute errors. Absolute error (13) of X_{1} in (a) the mean for ETL (gradient 0.9) and xETL (gradient 0.06), MPTL (gradient 0.8) and xMPTL (gradient 0.3), TTTL (gradient 0.4) and xTTTL (gradient 0.0); (b) the second moment for ETL (gradient 0.9) and xETL (gradient 0.1), MPTL (gradient 0.8) and xMPTL (gradient 0.6), TTTL (gradient 0.3) and xTTTL (gradient 0.0). Results from 10^{8} simulations, each run for T = 5.
<p>Figure 7</p>System 5 high peak absolute errors
System 5 high peak absolute errors. Absolute error (13) of X_{1} in (a) the mean for ETL (gradient 1.2) and xETL (gradient 0.9), MPTL (gradient 1.6) and xMPTL (gradient 1.9), TTTL (gradient 1.0) and xTTTL (gradient 0.4); (b) the second moment for ETL (gradient 1.2) and xETL (gradient 0.9), MPTL (gradient 1.3) and xMPTL (gradient 1.8), TTTL (gradient 1.2) and xTTTL (gradient 0.5). Results from 10^{8} simulations, each run for T = 5.
The gradient of the MPTL mean error seems to change from approximately one (Figure
6) to around 1.5 (Figure
7). It is unlikely that this is due to Monte Carlo error, as the error of the MPTL is high enough that this should not be an issue. In fact, this is probably due to the large volume limit behaviour of the MPTL, discussed after Algorithm 3. Because the mean of the high peak is several times higher than the mean of the low peak, the system is closer to the large volume limit and the weak order of the MPTL increases accordingly. Once in the large volume limit, the gradient is expected to be two.
The point at which the data is separated must be chosen carefully, as it can influence the error results. Figure
8 shows the distribution of X at T = 5 calculated using a CME solver. The three choices of splitting points that we compare have also been marked. We chose X = 300; this is a fairly obvious splitting point in our case. To support this, we tested the effects of splitting the data given by the ETL at X = 250 and X = 350. We calculated the relative change in mean, second moment and variance between these two splitting values, averaged over all simulations of the ETL and all five timesteps: Table
1 shows that in this case the percentage difference is relatively small for the mean, becomes larger for the second moment, and is very high for the variance. This implies that the choice of splitting point is very important in this case, and should be carefully considered.
<p>Figure 8</p>System 5 full distribution, T=5
System 5 full distribution, T = 5. The distribution of System 5 calculated using a CME solver for T = 5. The three splitting points we have compared in this paper are illustrated.
<p>Table 1</p>
Moment
Low peak
High peak
Relative differences in moments
E
(
f
(
X
)
)
for data split at X = 250 and X = 350 for the Schlögl system as percentages of their values when split at X = 300, i.e.
100
∗

E
(
f
(
X
)
)
split
350
−
E
(
f
(
X
)
)
split
250

/
E
(
f
(
X
)
)
split
300
.
Mean
6.4%
1.6%
Second moment
21.8%
2.5%
Variance
81.6%
50.5%
Relative differences in moments for different splitting values of Schlögl system
Higher extrapolation
In principle it is possible to use extrapolation in conjunction with some fixedstep method such as the ETL to obtain increasingly higherorder approximations using the appropriate extrapolation coefficients from the Romberg table. In practice, the usefulness of repeated extrapolations is debatable, as each adds extra computational overhead and the higher accuracy can be obscured by Monte Carlo fluctuations. However it might be useful to create up to third or fourthorder methods in this way. We tried doubleextrapolating the ETL on our test systems, with reasonable success. Figure
9 shows the ETL, xETL and doubleextrapolated Euler τleap (xxETL) errors on species X
_{1} of System 2; in this case we can see that the xxETL results are approximately order three (or higher). Doubleextrapolating gave substantially lower errors in almost every system, but it was not always easy to determine the order of convergence. A good example of this is System 3 in Figure
10. In such systems with relatively high Monte Carlo error, the approximate solutions from the xxETL were obscured by these fluctuations. The order of accuracy of the xxETL could be successfully seen for most molecular species of System 2, but it was not possible for the other systems as this would require a significant increase in the number of simulations, in order to further reduce the Monte Carlo error.
<p>Figure 9</p>System 2, doubleextrapolated absolute errors
System 2, doubleextrapolated absolute errors. Absolute error of X_{1} for the Euler τleap (ETL), extrapolated Euler τleap (xETL) and doubleextrapolated Euler τleap (xxETL) algorithm for (a) the mean, gradients 1.3 (ETL), 2.3 (xETL), 3.7 (xxETL), and (b) the second moment, gradients 1.3 (ETL), 2.4 (xETL), 3.8 (xxETL). Results from 10^{8} simulations. Doubleextrapolation produces consistently higherorder results.
<p>Figure 10</p>System 3, doubleextrapolated absolute errors
System 3, doubleextrapolated absolute errors. Absolute error of X_{1} for ETL, xETL and xxETL for (a) the mean, gradients 1.0 (ETL), 2.0 (xETL) and 2.3 (xxETL), and (b) the second moment, gradients 1.0 (ETL), 2.0 (xETL), 2.1 (xxETL). Results from 10^{8} simulations. In this case, the xxETL errors do not show a clear order of accuracy, but are nonetheless smaller compared to the other methods.
Monte Carlo error
The weak error we calculate numerically is

E
(
f
(
x
(
t
)
)
)
−
E
S
(
f
(
x
n
τ
)
)

.
This error can be separated into two parts
E
(
f
(
x
(
t
)
)
)
−
E
∞
(
f
(
x
n
τ
)
)
+
E
∞
(
f
(
x
n
τ
)
)
−
E
S
(
f
(
x
n
τ
)
)
,
where
E
∞
(
f
(
x
n
τ
)
)
are the theoretical values of the moments calculated by an infinite number of simulations with stepsize τ. The first term is the truncation error of the moments from their analytical solutions, i.e. the bias of the method, which depends only on the choice of timestep. The second term is the Monte Carlo error, which depends only on the number of simulations and is given by
C
S
, where C is some constant and S the number of simulations. The Monte Carlo error can be so large that it overwhelms the bias of the underlying numerical method completely; in this case all of the numerical results are, in effect, incorrect, as they are random fluctuations.
This formulation is useful when the propensity functions are linear. In this case, the moment equations are closed, so
E
∞
(
f
(
x
n
τ
)
)
can be calculated for the appropriate numerical method. As an example, consider the mean of the ETL: its true value is given by Eq. (10) and the value of its numerical approximation can be found by iterating Eq. (8). In addition, a similar calculation can be found in Appendix D for the second moment. Unfortunately, this is not possible for nonlinear systems, since in this case the equations describing the evolution of the moments are not closed any more
40
.
Such a breakdown of the errors of System 1 with the ETL method is shown in Figure
11, using results from 10^{8} simulations. We see that for the case of the ETL and xETL, the bias can easily be seen as the Monte Carlo errors are relatively low compared to the bias. However, when we extrapolate a second time, the bias of the resulting estimator is so low that Monte Carlo fluctuations completely obscure it, even with 10^{8} simulations. This gives a poor approximation for the total error. The only way to reduce Monte Carlo error is to run more simulations or use variance reduction methods. This is a good illustration of why it is important to perform a suitable number of simulations to get accurate estimates of the moments.
<p>Figure 11</p>Split errors of System 1 using ETL
Split errors of System 1 using ETL. Split errors for System 1 for (a) the mean and (b) the second moment. The bias can be easily seen with the Euler τleap (ETL) or only a single extrapolation (xETL), but is obscured when we extrapolate a second time (xxETL). Results are from 10^{8} simulations.
Variance reduction methods, which aim to decrease the Monte Carlo error, are another useful way of reducing computational time: because the Monte Carlo error is lower, less simulations need to be run for a given accuracy, saving time. This is an important topic in its own right and we do not address it in this paper; we refer the interested reader to e.g.
41
. It is an active research area: recently Anderson and Higham
42
were able to significantly reduce the overal computational cost associated with the stochastic simulation of chemical kinetics, by extending the idea of multi level Monte Carlo for SDEs
43
44
.
Conclusions
Throughout this paper, we have given reduced computational time as a motivation for using the extrapolation framework. As this is an important issue, we support here our claim that extrapolation speeds up simulations at a given error level. Although twice as many simulations must be run for a single extrapolation, the loss in computational time from more simulations is compensated for by the significant reduction in error. This is important, as slow runtime is often a limitation of stochastic methods. Total computational times and the corresponding errors in the mean (in brackets) of all three methods used in this paper and their extrapolated and doubleextrapolated versions are shown in Tables
2 and
3 for Systems 1 and 4 (10^{8} and 10^{7} simulations, respectively). The estimated time to run the same number of SSA simulations is given for comparison. It should be noted that all the extrapolated and doubleextrapolated τleap times are estimates: the timeconsuming part of the extrapolation method is the two (or more) sets of simulations of the original method that must be run; the extrapolation itself is a fast arithmetic operation. The times for a single extrapolation are calculated as runtime(τ = h) + runtime(τ = h/2), and for a doubleextrapolation as runtime(τ = h) + runtime(τ = h/2) + runtime(τ=h/4). The extrapolated methods take several times longer to run, but the errors they give are several to hundreds of times lower. Most of the exceptions are where the error has clearly reached the Monte Carlo level (Tables
2 and
3, entries marked (MC)). Besides extrapolation, the other obvious way to reduce error is to use a smaller timestep, so the real test for the effectiveness of extrapolation is to compare the runtimes of cases with similar error values. In Tables 2 and 3, we have marked in bold the extrapolated errors that can be directly compared to the base method (i.e. read up one row) and take less time to run, and similarly for doubleextrapolated errors as compared to singleextrapolated ones. Although this varies for each system and simulation method, the general trend is that extrapolation takes less time to give a similar level of error, and this pattern was similar for our other example systems. Thus we feel that in general, extrapolation is a worthwhile procedure.
<p>Table 2</p>
τ
0.05
0.1
0.2
0.4
0.8
Processing times (in thousands of seconds) of 10^{8} simulations of System 1 using Euler τleap, midpoint τleap and θtrapezoidal τleap methods and their extrapolated and doubleextrapolated versions. Absolute errors in the mean from the analytical values are included in brackets. For comparison, implementation of 10^{8} simulations of an SSA using the Direct Method would take approximately 1.2 × 10^{5}seconds. Entries marked (MC) are thought to be below the Monte Carlo error level; bold entries are faster than the un/singleextrapolated versions with similar error level (i.e. compare with one row above).
ETL
21.3 (9.2)
13.3 (18.4)
7.63 (37.1)
4.08 (74.7)
2.21 (152.0)
xETL

34.3 (0.05)
21 (0.16)
11.8 (0.61)
6.34 (2.6)
xxETL


(MC) 42.1 (0.015)
(MC) 25 (0.016)
14 (0.043)
MPTL
21.3 (0.024)
13.4 (0.070)
7.65 (0.25)
4.18 (1.0)
2.24 (4.2)
xMPTL

(MC) 34.7 (0.0088)
(MC) 21.1 (0.0082)
(MC) 11.8 (0.0026)
6.38 (0.041)
xxMPTL


(MC) 42.4 (0.0089)
(MC) 25.2 (0.0090)
(MC) 14 (0.0088)
TTTL
40.8 (0.017)
21.1 (0.063)
13.3 (0.26)
7.65 (1.1)
4.17 (4.2)
xTTTL

(MC) 62.1 (0.0016)
(MC) 34.6 (0.0022)
(MC) 20.9 (0.0044)
11.8 (0.041)
xxTTTL


(MC) 75.4 (0.0022)
(MC) 42.2 (0.0031)
(MC) 25.1 (0.011)
Processing times of System 1
<p>Table 3</p>
τ
0.005
0.010
0.020
0.040
0.08
Processing times (in thousands of seconds) of 10^{7} simulations of System 4 using Euler τleap, midpoint τleap and ??trapezoidal τleap methods and their extrapolated and doubleextrapolated versions. Absolute errors in the mean of X
_{1}from the SSA values are included in brackets. Running 10^{7} simulations of an SSA using the Direct Method would take approximately 1.47 × 10^{7} seconds. Entries marked (MC) are thought to be below the Monte Carlo error level; bold entries are faster than the un/singleextrapolated versions with similar error level (i.e. compare with one row above).
ETL
191 (16.7)
82.3 (34.5)
46 (76.1)
32.6 (188.5)
19.7 (630.1)
xETL

293 (1.1)
123 (7.2)
66.4 (36.3)
35.1 (253.1)
xxETL


279 (0.93)
152 (2.5)
80.1 (36.0)
MPTL
165 (1.3)
84 (3.6)
44.9 (15.0)
24.1 (65.5)
12.3 (257.1)
xMPTL

244 (1.0)
129 (7.7)
69.1 (35.5)
36.3 (126.2)
xxMPTL


(MC) 289 (1.2)
(MC) 153 (1.5)
81.3 (5.3)
TTTL
277 (1.2)
155 (3.5)
82.4 (13.5)
44.6 (58.1)
23.3 (90.0)
xTTTL

(MC) 430 (0.41)
(MC) 239 (0.15)
128 (1.4)
68 (107.5)
xxTTTL


(MC) 515 (0.45)
(MC) 283 (0.37)
(MC) 151 (16.9)
Processing times of System 4
Monte Carlo error is an unavoidable problem when using stochastic simulations. The statistical fluctuations inherent in stochastic systems can obscure the bias error (i.e. order of convergence) of the numerical method if their size relative to the bias is large, as the total error is made up of these two contributions. A large number S of simulations must be run, as the Monte Carlo error scales as
1
/
S
. This error varies for each system. Figures
9 and
10 (and 11) show this clearly: both of the xxETLs have total errors of similar size for the same τ, but System 2 has relatively low Monte Carlo error, allowing us to see the bias of that system. However, we believe that System 3 has relatively high Monte Carlo error compared to its bias, implying that the xxETL errors we see in the figure are all due to statistical fluctuations. It should be noted that this seems to happen for all five test problems we use. The reason for this is that the extrapolated methods (and even the MPTL and TTTL, in some cases) have very high accuracy (i.e. low bias error). Since it is only possible to run a limited number of simulations, when the bias is very small, the total error will be given almost completely by the contribution from the Monte Carlo error.
A contrasting approach to reducing numerical errors is the multilevel Monte Carlo method. Originally developed for SDEs
43
44
, it has recently been extended to discrete chemical kinetics
42
. By considering a formulation of the total error similar to Eq. (15), the multilevel MonteCarlo method aims to reduce it by decreasing the Monte Carlo error. Here also many approximate solutions are generated with a variety of different timesteps. By intelligently combining many coarsegrained simulations with few finegrained ones, it is possible to find a similar level of accuracy to just using finegrained simulations. In contrast, extrapolation uses the same number of coarse and finescale solutions and gives results which are more accurate than the finescale solution, by reducing the bias instead of the Monte Carlo error. In cases where the bias is obscured by statistical errors, using a combination of both extrapolation and the multilevel Monte Carlo method would be ideal, as it would reduce both sources of error. This is an interesting research question and we plan to address it in the future.
In this work, we have extended the extrapolation framework, which can increase the weak order of accuracy of existing numerical methods, to the discrete stochastic regime. To demonstrate the concept, we have applied it to three fixedstep methods, the Euler, midpoint and θtrapezoidal τleap methods. Thus we have demonstrated numerically the effectiveness of extrapolation on a range of discrete stochastic numerical methods with different orders of accuracy for a variety of problems. The main requirement to use extrapolation with a numerical method is the existence of an expression for the global error that relates the error to the stepsize of the method. Analytically, this is all that must be found to show higher weak order convergence of the extrapolated method. To extrapolate once, only the leading error term need be known; further extrapolation requires knowledge of higher terms. We have found the form of the global weak error for a general weak order one method; the procedure is similar for higherorder methods. This is the real power of our approach: it can be applied to any fixedstep numerical method. Moreover, further extrapolations can raise the order of accuracy of the method indefinitely (although beyond a certain point the lower errors will be overtaken by Monte Carlo errors). We expect our method to be useful for more complex biochemical systems, for instance where frequent reactions must be simulated fast but accuracy is still important.
Appendices
SDE extrapolation
We summarise below the work of Talay and Tubaro
23
on extrapolating SDEs. The global weak error of a stochastic numerical scheme is
error
(
T
,
h
)
=
E
(
f
(
x
(
t
)
)
)
−
E
(
f
(
x
n
h
)
)
.
To extrapolate this scheme, we must obtain an expression of the form (1). We start with the Kolmogorov backward equations,
u
t
+
ℒ
u
=
0
,
u
(
x
,
t
)
=
f
(
x
)
,
where
u
t
=
∂u
∂t
f is a homogeneous function on
[
0
,
T
]
×
R
N
→
R
and
u
(
x
,
t
)
=
E
(
f
(
x
(
t
)
)

x
(
t
)
=
x
)
. x are the solutions of the SDE (6) with initial conditions x(0) = x
_{0} and
ℒ
is the generator of (6),
ℒ
≡
a
(
x
,
t
)
·
∇
x
+
1
2
b
(
x
,
t
)
b
T
(
x
,
t
)
:
∇
x
∇
x
,
with
A
:
B
=
∑
i
,
j
A
ij
B
ij
. Crucially, (16) can then be written as
error
(
T
,
h
)
=
E
u
(
x
0
,
0
)
−
E
u
(
x
n
h
,
T
)
.
The first term is known; the second can be found by recursively calculating the error between
u
when it is evaluated from adjacent timesteps:
E
u
(
x
n
h
,
T
)
,=
E
u
(
x
0
,
0
)
+
h
2
∑
k
=
0
n
−
1
E
ψ
(
x
k
h
,
kh
)
+
h
2
K
n
h
,
where ψ(x
t) is an error function that is wellbehaved and specific to each numerical method and
K
n
h
is a constant of unit order. For the case of the EulerMaruyama and Milstein methods, the second term is
O
(
h
)
, so their global weak error is
23
error
(
T
,
h
)
=
−
h
∫
0
T
ψ
E
(
x
(
s
)
,
t
)
ds
+
O
(
h
2
)
.
Eq. (17) shows that the EulerMaruyama and Milstein methods have global weak order one. It is easy to see that extrapolating them leads to solutions with global weak order two.
Discrete stochastic global error expansion
Here we derive a global error expansion similar to Eqs. (1) and (17) for the numerical approximation of moments by a weak firstorder discrete stochastic method, such as the Euler or midpoint τleap methods. Once the form of the error expansion is known, it is clear what extrapolation coefficients to use and extrapolation is simple. Similarly to the case of SDEs, we start with the Kolmogorov backward equations
∂
t
u
+
ℒ
u
=
0
in
Z
+
N
×
[
0
,
T
)
,
u
(
x
,
t
)
=
f
(
x
)
on
Z
+
N
×
{
T
}
.
Here
ℒu
≡
∑
j
=
1
M
a
j
(
x
)
u
(
x
+
ν
j
)
−
u
(
x
)
and
u
(
x
,
t
)
,=
E
f
(
X
T
)

X
t
=
x
,
with X
_{
t
} = X(t). Using semigroup notation it is possible to formally denote the solution of (18) as
u
(
x
,
t
)
,=
e
(
T
−
t
)
ℒ
f
(
x
)
,
which can be useful for quantifying the local error of a numerical approximation. By applying a stochastic Taylor expansion
45
to jump processes
16
, the onestep expansion for the expectation of f calculated with a firstorder numerical method should satisfy
E
(
f
(
X
h
h
)

X
0
h
=
x
)
=
A
h
f
(
x
)
=
f
(
x
)
+
h
ℒf
(
x
)
+
∑
j
=
2
∞
h
j
j
!
A
j
f
(
x
)
,
where
A
j
are difference operators associated with the numerical method in hand. Furthermore we have for the true onestep expansion
E
(
f
(
X
h
)

X
0
h
=
x
)
=
e
ℒh
f
(
x
)
=
f
(
x
)
+
h
ℒf
(
x
)
+
∑
j
=
2
∞
h
j
j
!
ℒ
j
f
(
x
)
.
Remark 1
An important element in our derivation of a global error expansion relates to the boundedness of u(x
t), and its discrete derivative. This boundedess is guaranteed
14
when the number of molecules in the chemical system is conserved or decreases with time. Proving this in the general case where zerothorder reactions can add molecules to the system is a nontrivial task. One way around this problem is to set the propensity functions a
_{
j
}(x) to zero outside a large but finite domain
14
; this is the approach we follow here.
Now if we let e(f,T,h) define the global error in the numerical approximation of
E
(
f
(
X
T
)
)
at time T = nhby a firstorder numerical method with timestep h we see that
e
(
f
,
T
,
h
)
=
E
(
f
(
X
T
)

X
0
=
x
)
−
E
(
f
(
X
T
h
)

X
0
h
=
x
)
=
u
(
x
,
0
)
−
E
(
u
(
X
T
h
,
t
)
,
X
0
h
=
x
)
=
∑
i
=
1
N
E
(
u
(
X
(
i
−
1
)
h
h
,
(
i
−
1
)
h
)
)
−
E
(
u
(
X
ih
h
,
ih
)
)
=
∑
i
=
1
N
E
(
u
(
X
ih
∗
,
ih
)
)
−
E
(
u
(
X
ih
h
,
ih
)
)
,
where we have used the result in Figure
12, and for notational simplicity omitted the dependence of the expectations on the initial conditions. If we take
g
i
(
y
)
=
u
(
y
,
ih
)
=
e
ℒ
(
T
−
ih
)
f
(
y
)
, then
E
(
u
(
X
ih
∗
,
ih
)
)
−
E
(
u
(
X
ih
h
,
ih
)
)
=
E
(
g
i
(
X
ih
∗
)
)
−
E
(
g
i
(
X
ih
h
)
)
.
Applying Eqs. (19) and (20) to g
_{
i
},
e
(
f
,
T
,
h
)
=
∑
i
=
1
N
E
h
2
2
(
ℒ
2
−
A
2
)
g
i
(
X
(
i
−
1
)
h
h
)
+
h
3
R
i
h
=
∑
i
=
1
N
E
h
2
2
(
ℒ
2
−
A
2
)
e
−
h
ℒ
g
i
−
1
(
X
(
i
−
1
)
h
h
)
+
h
3
E
(
R
i
h
)
=
∑
i
=
1
N
h
2
2
E
(
ℒ
2
−
A
2
)
e
−
h
ℒ
u
(
X
(
i
−
1
)
h
)
h
,
(
i
−
1
)
h
)
+
h
3
E
(
R
i
h
)
,
<p>Figure 12</p>Local error
Local error. Representation of the local error in terms of the function
u
(
x
,
t
)
,=
E
(
f
(
X
T
t
,
x
)
)
.
where we have used the fact that
g
i
=
e
−
h
ℒ
g
i
−
1
. We thus obtain
e
(
f
,
T
,
h
)
=
∑
i
=
1
N
h
2
E
(
ψ
e
(
(
X
(
i
−
1
)
h
h
,
(
i
−
1
)
h
)
)
)
+
h
3
E
(
R
~
i
h
)
,
where
ψ
e
(
x
,
t
)
,=
1
2
ℒ
2
−
A
2
u
(
x
,
t
)
,,
and
E
(
R
~
i
h
)
≤
C
(
t
)
,,
from our assumptions on the boundedness of u(x,t). Furthemore it is easy to see that
∑
i
=
1
N
h
E

ψ
e
(
(
X
(
i
−
1
)
h
h
,
(
i
−
1
)
h
)

≤
C
(
T
)
.
Using this and results from Talay and Tubaro
23
we find that
e
(
f
,
T
,
h
)
=
−
h
∫
0
T
E
(
ψ
e
(
X
s
,
s
)
)
ds
+
O
(
h
2
)
,
whereψ
_{
e
}(x
t) is dependent on the numerical method and given by (22).
An example explicit calculation of the global error expansion for a linear system
In Eq. (12) we calculated the global error on the mean for a linear system using the equations for its true and numerical solutions. We show how this approach connects to the general formulation of the errors given by Eq. (21) by using System 1 as an example. We choose this example as it is onedimensional and linear and is simple enough that the global error Eq. (12) can be calculated explicitly from the general formulation. We define the following notation for the discrete difference:
f
ν
=
f
(
x
+
ν
)
−
f
(
x
)
,
where we have assumed that we have only one chemical species subject to only one chemical reaction. For this particular case,
ℒ
2
f
=
a
2
f
νν
+
a
2
a
ν
f
νν
+
a
a
ν
f
ν
,
while for the ETL
A
2
f
=
a
2
f
νν
,
and thus
ψ
e
(
x
,
t
)
=
1
2
(
a
2
(
x
)
(
a
(
x
+
ν
)
−
a
(
x
)
)
(
u
(
x
+
2
ν
,
t
)
−
2
u
(
x
+
ν
,
t
)
+
u
(
x
,
t
)
)
+
1
2
a
(
x
)
(
a
(
x
+
ν
)
−
a
(
x
)
)
(
u
(
x
+
ν
,
t
)
−
u
(
x
,
t
)
)
.
For particle decay, ν = −1 and a(x) = κx. For g(x) = x, the solution to (18) is
u
(
x
,
t
)
=
x
e
−
κ
(
T
−
t
)
,
so (23) becomes
ψ
e
(
x
,
t
)
=
1
2
κ
2
x
e
−
κ
(
T
−
t
)
.
Thus the global error given by the general formulation is
∑
i
=
1
N
h
2
E
(
ψ
e
(
(
X
(
i
−
1
)
h
h
,
(
i
−
1
)
h
)
)
)
=
κ
2
h
2
2
∑
i
=
1
N
E
(
X
(
i
−
1
)
h
)
h
)
e
−
κ
(
T
−
(
i
−
1
)
h
)
=
κ
2
h
2
2
∑
i
=
1
N
(
1
−
κh
)
e
κh
i
−
1
e
−
κT
E
(
X
0
h
)
=
κ
2
Th
2
(
1
−
κh
)
e
κh
N
−
1
(
1
−
κh
)
e
κh
−
1
e
−
κT
N
E
(
X
0
h
)
=
κ
2
Th
2
E
(
X
0
h
)
+
O
(
h
2
)
,
which agrees exactly with what one obtains by calculating Eq. (12) for System 1 (i.e. setting d=0,W=−κ).
Formulae for the second moment of the Euler τleap in the case of linear systems
We can write the formulae for the numerical and analytical evaluation of the second moment of the ETL. This is useful for finding the relative contributions of the bias and Monte Carlo errors, as well as for explicitly calculating the local errors. The ETL evolves the second moment in time as
E
x
m
+
1
τ
(
x
m
+
1
τ
)
T
=
E
x
m
τ
+
ν
P
(
τ
(
C
x
m
τ
+
d
)
)
x
m
τ
+
ν
P
(
τ
(
C
x
m
τ
+
d
)
)
T
.
Using the law of total expectation and writing
S
m
τ
=
E
(
x
m
τ
(
x
m
τ
)
T
)
,
B
m
τ
=
diag
(
C
E
(
x
m
τ
)
)
)
and
D
=diag(d), we obtain
S
m
+
1
τ
=
S
m
τ
+
τW
S
m
τ
+
τ
S
m
τ
W
T
+
τ
μ
m
τ
d
T
ν
T
+
τν
d
(
μ
m
τ
)
T
+
τ
2
W
S
m
τ
W
T
+
τ
2
W
μ
m
τ
d
T
ν
T
+
τ
2
ν
d
(
μ
m
τ
)
T
W
T
+
τ
2
ν
d
d
T
ν
T
+
τν
B
m
τ
ν
T
+
τνD
ν
T
.
We can now iterate this formula in order to obtain the numerical approximation for the second moment of the ETL at any timestep.
The behaviour of the second moment in time as given by the CME,
S
(
t
)
=
E
(
x
(
t
)
x
T
(
t
)
)
, is
33
40
dS
(
t
)
dt
=
WS
(
t
)
+
S
(
t
)
W
T
+
μ
(
t
)
d
T
ν
T
+
ν
d
μ
T
(
t
)
+
νB
(
t
)
ν
T
+
νD
ν
T
,
where B(t)=diag(C
μ
(t)). By solving Eq. (26) and iterating Eq. (25), we can use the formulation of Eq. (15) to quantify the bias and Monte Carlo errors of the ETL. A similar approach can also be used for the MPTL and TTTL.
List of abbreviations
ODE;Ordinary differential equation, SDE;Stochastic differential equation, CME;Chemical Master Equation, SSA;Stochastic simulation algorithm, ETL;Euler τleap, xETL;Extrapolated Euler τleap, xxETL;Doubleextrapolated Euler τleap, MPTL, xMPTL, xxMPTL; Midpoint τleap and extrapolated and doubleextrapolated versions,TTTL, xTTTL, xxTTTL;Thetatrapezoidal τleap and extrapolated and doubleextrapolated versions, CV;Coefficient of variation
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
The main ideas were jointly discussed and developed by all authors, and all authors wrote the manuscript. The global error expansion was mainly calculated by KCZ. TSz performed the simulations and analysed them. All authors have read and approved the final version of the manuscript.