### Abstract

#### Background

When mathematical modelling is applied in many different application areas, a common task is the estimation of states and parameters based on measurements. In this kind of inference, uncertainties in the times at which the measurements were taken are often neglected, but especially in applications from the life sciences, errors of this kind can considerably influence the estimation results. As an example in the context of personalized medicine, the model-based assessment of the effectiveness of drugs is beginning to play an important role. Systems biology may help here by providing good pharmacokinetic and pharmacodynamic (PK/PD) models. Inference on these systems based on data gained from clinical studies with several patient groups becomes a major challenge. Particle filters are a promising approach to tackle these difficulties but are by themselves not able to handle uncertainties in measurement times.

#### Results

In this article, we describe a variant of the standard particle filter (PF) algorithm which allows state and parameter estimation with the inclusion of measurement time uncertainties (MTU). The modified particle filter, which we call MTU-PF, also allows the application of an adaptive stepsize choice in the time-continuous case to avoid degeneracy problems. The modification is based on the model assumption of uncertain measurement times. While the assumption of randomness in the measurements themselves is common, the corresponding measurement times are generally taken as deterministic and exactly known. Especially in cases where the data are gained from measurements on blood or tissue samples, a relatively high uncertainty in the true measurement time seems to be a natural assumption. Our method is appropriate in cases where relatively few data are used from a relatively large number of groups or individuals, which introduce mixed effects in the model. This is a typical setting of clinical studies. We demonstrate the method on a small artificial example and apply it to a mixed effects model of plasma-leucine kinetics with data from a clinical study which included 34 patients.

#### Conclusions

Comparisons of our MTU-PF with the standard PF and with an alternative Maximum Likelihood estimation method on the small artificial example clearly show that the MTU-PF obtains better estimates. In the application to the data from the clinical study, the MTU-PF performs similarly to the standard particle filter with respect to the quality of the estimated parameters, but it proves to be less prone to degeneration than the standard particle filter.

##### Keywords:

Particle filter; Sequential Monte Carlo methods; Nonlinear filtering; Parameter estimation; Measurement time uncertainties; PK/PD; Mixed effects; Leucine kinetics

### Background

#### Measurement time uncertainties

Uncertainty in the time at which a measurement is taken is an often neglected source of random error. While in many application areas this kind of error is generally small and indeed negligible (due to automated measurements and precise timings), in others it may have a real influence, especially in the life sciences. As a prominent example, one may consider pharmacokinetic and pharmacodynamic (PK/PD) models, which are used to describe the metabolic interactions and the effects of a chemical agent (like a drug or a labelled substance) over time inside an organism, respectively.

A typical population experiment in the PK/PD context consists in the analysis of the contents of the blood plasma of several individuals with respect to concentrations of certain molecules of interest. For this purpose, blood samples have to be taken from each individual at certain (fixed) time points after a certain event has occurred (e.g. a drug or a labelled substance has been applied). It is clear from the setting of the experiments that there is some variation in the real point in time when the blood sample has been taken: the true time when the measurement value has been obtained might be shortly before or after the intended time, and this true measurement time is not known to us. Since the inclusion of those time uncertainties in the model usually makes the analysis more difficult, it is standard to lump the time uncertainties with the measurement error. But especially at early times when concentrations change quickly, this may easily lead to wrong estimates, even if one assumes very high variances of the measurement error (we will demonstrate this later on a simple example). On the other hand, the inclusion of measurement time uncertainties (MTU) in algorithms aiming at inference in complex models is not straightforward. In this article, we will present a modification of the Particle Filter (PF) algorithm (which we call MTU-PF) which is able to fully include a statistical model of the time uncertainties.

#### Inference in complex systems

The assessment of the effectiveness of a drug in a clinical study has in the past been done by directly computing relatively simple statistical values. The enormous increase in complexity of the underlying models, due to present developments in medicine and biology, for instance in the areas of personalized medicine or systems biology, also increases the need for more sophisticated model-based inference methods.

The estimation of unobservable internal variables or model parameters from data which have been obtained from blood or tissue samples at several time points can reveal information on the concentrations and effectiveness of the substance in question. If these data come from individuals who belong to two (or even more) different groups, e.g. test and control group, mixed effects are introduced in the underlying models. The inherent non-linearity and high variability of biological processes add considerably to the difficulties one faces during the inference step. Inference in connection with dynamic models also plays a major role in many other application areas. State and parameter estimation as well as model discrimination and validation are most common, but optimal control problems should also be mentioned.

It is often not enough to consider (independent) measurement noise [1]. Correlations between residuals are not uncommon, and the violation of this statistical assumption may lead to wrong estimates. A natural way to include correlated noise is to model two different types of noise: the dynamic (process or system) noise which is present in the dynamics of the system states and originates either from true random fluctuations in the system or from unmodelled dynamics in the system, and the measurement noise which is introduced by the measurement procedure or equipment and modelled by independent residuals. One possible approach is to use state space models which consist of a time-continuous model for the system states, e.g. based on Stochastic Differential Equations (SDEs), and a separate model for the time-discrete measurements.

#### Parameter estimation with Maximum Likelihood approach

Parameter estimation in state space systems is a difficult problem. In a context where the system dynamics are modelled by Ordinary Differential Equations (ODEs) without correlated noise, the problem is most often considered as a (deterministic) optimization problem based on a Maximum Likelihood (ML) formulation. An overview of these approaches can be found in [2] and [3]; see also [4], which also considers other aspects like identifiability. A generalization of the ML approach including more flexible cost functions is given by the prediction error estimation method ([5]). The introduction of system noise in the state variables leads to optimization problems with SDE constraints. In this case, internal system states which cannot be directly observed need to be estimated jointly with the parameters, given the data. For this purpose, the parameter estimation methods must be augmented by appropriate state filtering methods. An overview of ML parameter estimation in these types of models is given in [6]. If the SDEs are non-linear, linearizations to the Kalman Filter, like the Extended Kalman Filter (EKF) or the Unscented Kalman Filter (UKF), are used to establish approximations to the means and covariances of the filter distributions over time. All these approximations suffer from the fact that they approximate the filtering distributions of the states by a Gaussian distribution at all time points and cannot adequately approximate skewed or multimodal distributions. Better approximations are provided by simulation-based methods like Sequential Monte Carlo (SMC) algorithms, for which good convergence results have been established ([7]). Nevertheless, they suffer from several drawbacks when applied to the joint estimation of dynamic states and fixed parameters ([8-10], see also [11]).

#### Parameter estimation in a Bayesian context

In a Bayesian context, in contrast to the “classical” ML approach, a prior distribution is assigned to the parameter vector, hence the parameters can be treated as random variables. In this sense, parameter estimation is done by evaluating the so-called posterior distribution which can be computed (at least theoretically) by Bayes’ theorem given the observations (measurements) and the prior distribution. In the context of high-dimensional spaces, this requires the computation of high-dimensional integrals which is not possible to do analytically. For this purpose, Markov Chain Monte Carlo (MCMC) methods provide powerful tools for the computation of simulation-based approximations to the posterior distribution. Again, in the context of the joint estimation of dynamic states and fixed parameters, the design of good proposal densities is a very difficult problem which renders the use of standard MCMC methods like the Metropolis-Hastings sampler impractical for the purposes of parameter estimation in state space systems.

It has long been a wish to combine both (dynamic) SMC and (static) MCMC methods to provide a general tool for the joint estimation of dynamic states and static parameters. Only recently, Andrieu et al. [11] proposed a very promising combination of both types of Monte Carlo approaches called Particle Markov Chain Monte Carlo (PMCMC) which is generally applicable and where also convergence has been proved.

In the present article, even though the PMCMC approach might be the preferred method for parameter estimation in state space systems, we will concentrate solely on the SMC methods, since our modification affects only this part. However, to be able to do parameter estimation in a pure SMC context, we rely on an approach that is very often used to avoid problems with the estimation of constant parameters. This approach consists in the introduction of artificial dynamics in the parameters, that means the parameters are allowed to slightly change their values over time. In this way, and in a Bayesian context, the parameters can be treated exactly in the same way as the system states. After building an augmented system state by concatenating the parameter vector and the state vector, the joint estimation of states and parameters reduces to filtering of the augmented state vector which makes SMC methods directly applicable to the problem.
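The state augmentation described above can be sketched as follows. This is a minimal illustration, not the implementation used in this article: the one-state decay model, the function names `propagate_augmented` and `decay_drift`, and all numerical values are assumptions chosen for the example. Each particle carries the state and the parameter side by side, and the parameter receives a small random-walk perturbation (the artificial dynamics) in every time step.

```python
import numpy as np

def propagate_augmented(z, dt, rng, n_states, drift, state_noise_sd, param_jitter_sd):
    """Advance augmented particles z = [state | parameters] by one time step."""
    x, theta = z[:, :n_states], z[:, n_states:]
    # State part: Euler step of the model dynamics plus process noise.
    x_new = x + drift(x, theta) * dt \
        + state_noise_sd * np.sqrt(dt) * rng.standard_normal(x.shape)
    # Parameter part: artificial dynamics, here a small Gaussian random walk.
    theta_new = theta + param_jitter_sd * np.sqrt(dt) * rng.standard_normal(theta.shape)
    return np.hstack([x_new, theta_new])

def decay_drift(x, theta):
    # Hypothetical one-state model dx/dt = -theta * x (illustration only).
    return -theta * x

rng = np.random.default_rng(0)
# 500 particles: column 0 is the state, column 1 the (unknown) decay rate.
particles = np.column_stack([np.full(500, 1.0), rng.uniform(0.5, 1.5, 500)])
theta0 = particles[:, 1].copy()
particles = propagate_augmented(particles, 0.01, rng, 1, decay_drift, 0.05, 0.01)
```

Because the parameters are now part of the particle, any filter that works on the state vector also updates the parameter estimates. The size of the jitter is a tuning choice: too small and the parameter cloud collapses, too large and the estimates are blurred.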

#### Particle filters for state and parameter estimation

Particle filters ([12-14]) belong to the class of SMC methods for state filtering in state space models. Using the state augmentation approach, the method is also capable of estimating system parameters. The standard particle filter is designed for discrete, non-linear, and non-Gaussian models and can routinely be adapted to the continuous case with measurements at discrete times. The idea of the particle filter is that, at each time point, there is a sample based representation (the weighted particles) of the current estimate of the inner states and parameters which is based on the measurements that have been obtained up to the current time point. The particle cloud is then propagated through time, and the particles and weights are updated accordingly at each time point where measurements are available.

#### Non-Linear Mixed Effects models

Estimation in a Non-linear Mixed Effects model (NLME) involves the estimation of both global and individual parameters. With classical maximum likelihood estimation, the individual parameters are random variables equipped with a distribution while the global parameters remain constants with a “true” but unknown value. If the underlying model equations are non-linear, this leads to likelihood functions which are not analytically accessible and one has to rely on approximations. In the context where the system dynamics are modelled by ODEs, the most popular tool for NLME parameter estimation in the PK/PD context is NONMEM ([15]). In [1] an estimation algorithm for NLME models based on Stochastic Differential Equations (SDEs) was proposed that uses the First-Order Conditional Estimation (FOCE) method to approximate the likelihood in combination with EKF estimation in the SDEs. This has been added to NONMEM ([16]). In [17], ODE and SDE based parameter estimation were compared, showing that the interindividual variabilities were in general estimated to be smaller for the SDE model. Donnet and Samson ([18]) proposed a stochastic approximation version of the Expectation-Maximization algorithm (SAEM) (for the estimation of the global parameters) in combination with MCMC methods (for the estimation of states and individual parameters). However, since MCMC exhibits slow mixing properties in the context of the estimation of states and parameters in state space models, in [19] MCMC has been replaced by the more promising PMCMC approach of Andrieu et al. ([11]).

On the other hand, in a Bayesian context, also the global parameters are equipped with a (prior) probability distribution, and the conceptual difference between global and individual parameters vanishes. The mixed effects model can then be considered as a hierarchical model with dependent parameters ([20,21], see also [22] for a more recent population-based Bayesian approach to PK/PD modelling). Simulation-based (Monte Carlo) methods can easily be adapted to this case. Nevertheless, the above mentioned challenges to both SMC and MCMC methods are even higher due to the increased number of states and parameters in NLME models (the number of states and individual parameters has to be multiplied by the number of individuals).

#### Aim of the article

Our goal is two-fold: Firstly, we want to show that the particle filter algorithm is applicable (with our modifications) also to more complex models when time uncertainties are formulated explicitly. Secondly, we want to show that the modification may even provide the possibility for further enhancement of the performance of the algorithm by presenting an adaptive time-stepping scheme which is only possible in the context of the new algorithm.

We do not claim that our MTU algorithm generally performs better or worse than the standard filter, nor that it should be the preferred method for estimation in non-linear mixed effects models. Rather, we provide a method which is usable for models where time uncertainties may play a major role. In these cases, it may indeed lead to better estimates. In addition, our method transfers the time-discrete particle filter approach, where updates based on the measurements depend strictly on the measurement times, to a truly time-continuous approach, where updates to the filtering distributions can be performed at every point on the time-scale. Since we want to focus on the time uncertainties, we do not discuss further issues like identifiability, model evaluation and model discrimination. In our application to the model of plasma-leucine kinetics, we try to avoid these issues by assigning ad-hoc values to some of the parameters (especially to the variances of the system states).

#### Motivating example

Let us have a look at an example for illustrating the benefits of a separate modelling of measurement time uncertainties. Let us consider a state space system given by the ODE

with parameters
*t*. We call the state trajectories obtained by this deterministic system the nominal
evolutions of the states. We add noise to the system in a standard way by introducing
an additional term
*q* over time:

Furthermore, let the initial state *q*(0) be given by a log-normal distribution with parameters
*q*(0), respectively). The parameters chosen in our implementation of this example are
shown in Table 1.

**Table 1.** **Parameters for the motivating example**

We assume that *M* measurements of the state *q*(*t*) will be taken at times *T*_{j} and that each measurement *j*, *j*=1,…,*M*, is disturbed by normal noise with mean *q*(*t*_{j}) and with a fixed variance *σ*_{y}^{2}, i.e. measurement *j* is distributed according to $\mathcal{N}(q(t_j), \sigma_y^2)$. In the standard setting, the measurement times *T*_{j} are assumed to be known. In contrast, we will assume that in addition to the measurement
value error, there is some uncertainty about the exact times where the measurements
have been taken. If we attempt to take the *j*th measurement at the intended (or nominal) time
*T*_{j} which may be shortly before or after the intended time
*T*_{j} is given as a realization of a random variable *T*_{j}. In our example, we assume that *T*_{j} follows a truncated normal distribution given by the density

with normalizing constant

and given intended measurement times
*j* for all possible intended times
*y*_{j} in dependence of the intended measurement time
*x*-axis, while the dark-green dashed line depicts the nominal evolution of the state
*q* over time. Subfigure (a) shows the distribution of the measurement values with time
uncertainties, while (b)-(d) depict the distribution of the measurement values with
known measurement times (in this case
*σ*_{y}: in (b), the original standard deviation is used, while in (c) and (d), higher standard
deviations are used which correspond to the cases with lumped value and time variations.

**Figure 1.** **Assumed measurement distributions for the motivating example.** Measurement distribution resulting from (**a**) separate modelling of measurement time uncertainties and measurement value uncertainties,
with *σ*_{y}=0.005, and (**b**-**d**) lumped time-and-value uncertainties with several different assumed lumped measurement
variances *σ*_{y}. The dashed dark-green line depicts the nominal evolution of the state *q* over time. The green shaded area depicts the region where the measurements are expected.

Comparing Figures 1(a) and 1(b-d), we observe that the distributions of the measurements exhibit clearly different shapes. For the “true” model depicted in Figure 1(a), if we consider a single point in time that lies in a time segment where the state values change quickly, the distribution of the measurement at this point in time is quite broad: the variance of the measured value is very high there, whereas it is small in time segments where the state values change slowly. In contrast, for the standard particle filter, the measurement variance is constant, and hence the assumed measurement distributions differ markedly from the “true” distributions, whatever value of *σ*_{y} is chosen.
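The effect described here can be reproduced with a small simulation. The sketch below is an illustration under assumptions and does not use the model of the motivating example: the nominal trajectory `nominal_state`, the time standard deviation `sigma_t`, and the truncation at three standard deviations are hypothetical choices; only *σ*_{y}=0.005 is taken from Figure 1. True measurement times are drawn from a truncated normal distribution around the intended time, and the spread of the resulting measurement values is compared at an early and a late intended time.

```python
import numpy as np

rng = np.random.default_rng(1)

def nominal_state(t):
    # Hypothetical nominal trajectory: fast initial decay, then a plateau.
    return np.exp(-5.0 * t)

def sample_truncated_normal(mean, sd, lower, upper, size, rng):
    """Rejection sampler for a normal distribution truncated to [lower, upper]."""
    out = np.empty(0)
    while out.size < size:
        draw = rng.normal(mean, sd, size)
        out = np.concatenate([out, draw[(draw >= lower) & (draw <= upper)]])
    return out[:size]

def measurement_sd(intended_time, sigma_t=0.05, sigma_y=0.005, n=20000):
    # True times scatter around the intended time; values get additive noise.
    half_width = 3.0 * sigma_t
    lower = max(intended_time - half_width, 0.0)
    true_times = sample_truncated_normal(
        intended_time, sigma_t, lower, intended_time + half_width, n, rng)
    values = nominal_state(true_times) + rng.normal(0.0, sigma_y, n)
    return values.std()

sd_early = measurement_sd(0.1)  # steep part of the trajectory
sd_late = measurement_sd(2.0)   # flat part of the trajectory
print(f"sd at t=0.1: {sd_early:.4f}, sd at t=2.0: {sd_late:.4f}")
```

In this sketch the spread at the early time is dominated by the time uncertainty, while at the late time it is essentially the value noise alone. A single lumped measurement variance cannot match both regimes.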

### Methods

We divide this section into three subsections. In the first subsection, we fix the state and observation model we want to consider. In the second subsection, entitled “Standard case”, we outline the standard particle filter algorithm in the context of time-continuous states with time-discrete measurements, and the various probability distributions involved. Although nothing is new in this subsection, it serves several purposes. Firstly, the time-continuous case is relatively rarely considered in the literature; secondly, the derivation of our modification needs a slightly more general formulation than is standard for the discrete-time filter; and lastly, the comparison of our modified version with the standard case might reveal the differences between the two approaches more clearly. In the third subsection, entitled “MTU particle filter”, we present our new modification of the particle filter. In the following section “Results and Discussion”, we compare the new MTU particle filter to the standard particle filter and to an alternative Maximum Likelihood estimation method on a simple artificial example. We also present an application of our MTU-PF method to a PK/PD study in a non-linear mixed-effects setting in direct comparison with the standard particle filter.

Note: a list of all used symbols with a short explanation can be found at the end of this paper.

#### The model

#### State process

Let
*t*∈ [*t*_{0},*∞*) with
*t*∈ [*t*_{0},*∞*) let further

For each *t*∈ [*t*_{0},*∞*), denote by
*X*_{t}, i.e.

the state space restricted to the interval [*t*_{0},*t*], and denote by
*s* and *t* with *t*>*s*≥*t*_{0}, let *K*_{s,t}(*x*_{s}, d*x*_{t}) be the Markov kernel of the process
*s* to time *t*.

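An important special case for the state process is a diffusion process (with the state space equipped with its Borel *σ*-algebra) defined through a stochastic differential equation (SDE)

$$\mathrm{d}X_t = a(X_t,t)\,\mathrm{d}t + B(X_t,t)\,\mathrm{d}W_t$$

with drift *a*(*x*,*t*), diffusion matrix *B*(*x*,*t*), and a multidimensional standard Wiener process as driving noise. In this case, one can sample approximately from the kernel *K*_{s,t} when a suitable discretization method is applied, for instance the Euler-Maruyama method.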
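As an illustration of such a discretization, the following sketch draws an approximate sample from the kernel *K*_{s,t} with the Euler-Maruyama method. The function name `euler_maruyama`, the fixed number of steps, and the linear test model are assumptions made for this example; the article does not prescribe a particular implementation.

```python
import numpy as np

def euler_maruyama(x0, t0, t1, n_steps, drift, diffusion, rng):
    """Approximate draw from K_{t0,t1}(x0, .) for dX = a(X,t) dt + B(X,t) dW."""
    x = np.asarray(x0, dtype=float)
    t = t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        B = diffusion(x, t)
        # Wiener increment: independent N(0, dt) per noise dimension.
        dW = rng.normal(0.0, np.sqrt(dt), size=B.shape[1])
        x = x + drift(x, t) * dt + B @ dW
        t += dt
    return x

# Hypothetical two-dimensional linear test model (illustration only).
rng = np.random.default_rng(0)
drift = lambda x, t: -x
diffusion = lambda x, t: 0.1 * np.eye(x.size)
x_end = euler_maruyama([1.0, 2.0], 0.0, 1.0, 1000, drift, diffusion, rng)
```

Calling the function once per particle (or vectorizing it over particles) propagates the particle cloud between two time points. The step size controls the discretization error of the sampled kernel.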

#### Observations / measurements

Let the process
*M* random variables *Y*_{1:M} with values in measurable spaces
*y*_{j} depends on the state variable
*T*_{j} and on the observation time (measurement time) *T*_{j} itself. We assume that, given the observation time *T*_{j} and the state
*y*_{j} is independent of all other variables, and the conditional measure can be expressed
via some conditional probability density
*g* such as linear dependence on the states or Gaussianity.

#### Observation / measurement times

The observation times (measurement times) *T*_{j} for *j*=1,…,*M* are usually assumed to be deterministically given and known. Our variant of the particle
filter will be based on the assumption that the observation times *T*_{j} are themselves realizations of random variables *T*_{j}. These variables model the uncertainty about exact observation times. In contrast
to the observation variables *y*_{j}, the observation times *T*_{j} are never observed (measured). We assume that all information available to us is
their probability distribution on the half axis [*t*_{0},*∞*), while in the case of the observations *y*_{j}, we know both the densities
*and* the observed values *y*_{j}.

In this article, we will only consider the simplest case where each variable *T*_{j} is independent of all others. Dependencies between the *T*_{j}’s, especially concerning the order of the observation times, may be considered natural but would lead to more complicated algorithms. However, order dependencies can easily be introduced via restrictions on the support of the variables. In general, the probability distribution of every single variable *T*_{j} shall be given by a density *γ*_{j}(*t*_{j}) with respect to the Lebesgue measure on [*t*_{0},*∞*).

In the following, we will consider the two cases mentioned, where either all *T*_{j} are deterministic and known or all *T*_{j} are random and unknown. Note that the first case formally coincides with the case
that *T*_{j} is random but observed. We will therefore stick to the notation

#### Standard case: measurement times deterministic and known

We will first consider the standard case, where the observation times *T*_{j} are known. For simplicity, we assume here that the observation times *t*_{1:M} are in strictly increasing order, i.e. *t*_{0}<*t*_{1}<⋯<*t*_{M}.

The standard case of the particle filter is usually formulated for discrete-time Markov
processes
*t*_{0} and at the times *t*_{1},…,*t*_{M} when measurements occur. Nevertheless, this case is included in our more general
framework where *X*_{t} is defined for all *t*≥*t*_{0}. One just focuses on the state variables for those times only. In view of the later
generalization to random observation times, we will consider the fixed values *T*_{j} as realizations of random variables *T*_{j} and condition all occurring densities on them. As mentioned above this assumption
leads to the same results as if we assumed the values *T*_{j} to be given deterministically.

#### Full model and filter model

The full model is given by the joint density of the variables
*Y*_{1:M} (conditioned on the observation times *T*_{1:M}=*t*_{1:M}) with respect to the product measure

The filter at a given time *t*_{k} is based on a reduced model. This model is given by the joint density of the variables
*Y*_{1:k} (conditioned on *T*_{1:M}=*t*_{1:M}) with respect to the product measure

This density is based on the state sequence
*Y*_{1:k} (given *T*_{1:M}=*t*_{1:M}) with respect to

and the filter density at time *t*_{k} with respect to

with

For general (non-linear) models, the practical computation of the filter density is
very difficult. Nevertheless, the particle filter computes a Monte Carlo approximation
using the fact that the filter densities
*t*_{k-1} given by the probabilities

for each set
*Y*_{1:k-1} (and *t*_{1:M}), by use of the kernel

for each set
*t*_{k}:

for each set

#### Importance sampling

Another ingredient for the particle filter is sequential importance sampling. We assume
that a second Markov chain
*j*=1,…,*M*. We assume that for each

exists. We further assume that the pushforward measure

For sequential importance sampling, we need to be able to sample from the initial
measure

for each

Using

we can then write the recursive formula (8) for the filter distribution at time *t*_{k} as

for each
*Y*_{1:M} is assumed to be fixed) is not necessary. Sequential importance sampling is performed
as follows. Draw a number *N* of realizations

Then, for all *k*=1,…,*M*, sample realizations
*i*=1,…,*N* and compute the unnormalized weights

For suitable integrable functions *h* (e.g. fulfilling some mild restrictions on how fast *h* may increase with *x*, see [23] for details), the expectation of *h* with respect to the filter density conditioned on the observations *Y*_{1:k}=*y*_{1:k}, given by

can then be approximated by

where *N* is the number of particles. In fact, it can be shown that as *N* approaches infinity, these empirical expectations converge to the filter expectations:

Note that if we can sample from the Markov kernels of

#### Resampling

If the number *N* of samples through time is fixed, the samples obtained by sequential importance sampling quickly degenerate since most of the normalized weights decrease rapidly towards 0. The degree of degeneracy is often measured by an estimate of the so-called effective sample size (ESS). This estimate at time *t* is given by

$$\widehat{N}_{\mathrm{eff}}(t) = \frac{1}{\sum_{i=1}^{N} \bigl(\tilde{w}_t^{(i)}\bigr)^{2}},$$

where

$$\tilde{w}_t^{(i)} = \frac{w_t^{(i)}}{\sum_{i'=1}^{N} w_t^{(i')}}$$

are the normalized weights and $w_t^{(i)}$ denotes the unnormalized weight of particle *i* at time *t*. The estimate attains its maximal value *N* if all weights are equal, and it approaches 1 as the variance of the weights, and thus the degree of degeneracy, increases. To avoid this degeneration of the samples, a resampling step needs to be done when the ESS drops below a threshold *N*_{Threshold} (which is usually chosen to be *N*/2).
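The ESS criterion can be sketched in a few lines. The function names `effective_sample_size` and `needs_resampling` are assumptions for this illustration; the estimate used is the usual inverse sum of squared normalized weights, and the default threshold of *N*/2 follows the text.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS estimate: inverse of the sum of squared normalized weights."""
    w = np.asarray(weights, dtype=float)
    w_norm = w / w.sum()
    return 1.0 / np.sum(w_norm ** 2)

def needs_resampling(weights, threshold_fraction=0.5):
    """True if the ESS has dropped below threshold_fraction * N."""
    return effective_sample_size(weights) < threshold_fraction * len(weights)

equal = np.ones(100)                 # no degeneracy
degenerate = np.array([1.0, 0, 0, 0])  # all mass on a single particle
print(effective_sample_size(equal), needs_resampling(equal))
print(effective_sample_size(degenerate), needs_resampling(degenerate))
```

Working with unnormalized weights as input keeps the function usable both before and after a weight update, since normalization is done internally.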

Resampling at some time *s*_{ℓ} is based on given non-negative (unnormalized) selection weights
*i*: One repeatedly selects particles with probabilities

This is multinomial resampling. There exist procedures where each single particle
is still selected with probability
*ι*_{ℓ}:*I*→*I* on the index set *I*: ={1,…,*N*}. Resampling is then done by replacing the state samples
*i* will be chosen is
*i* has been chosen after *N* draws is
*i* needs then to be corrected by replacing it by the weight

(using (16)). The necessary correction is therefore achieved if the unnormalized weights

Note that in the original particle filter, the selection weights
*s*_{ℓ} are chosen to be the particle weights (before the replacement), i.e.

such that after the resampling step the unnormalized weights are all equal to 1. Nevertheless, in general their choice is free and may be based on the observations (which is used in the so-called auxiliary particle filter [26]).
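A minimal sketch of multinomial resampling with freely chosen selection weights is given below. The function name `resample` and the convention of dividing each particle weight by its selection weight are assumptions of this illustration; the convention is chosen so that selecting with the particle weights themselves yields unnormalized weights equal to 1, as stated above. A lower-variance scheme (e.g. systematic resampling) could be substituted for the multinomial draw.

```python
import numpy as np

def resample(states, weights, selection_weights, rng):
    """Multinomial resampling with weight correction.

    Returns the resampled states, the corrected unnormalized weights,
    and the selection function iota as an index array.
    """
    states = np.asarray(states)
    w = np.asarray(weights, dtype=float)
    v = np.asarray(selection_weights, dtype=float)
    n = len(w)
    # Selection function iota: I -> I, drawn with probabilities v / sum(v).
    iota = rng.choice(n, size=n, p=v / v.sum())
    # Correct for the selection bias: divide by the selection weight.
    new_weights = w[iota] / v[iota]
    return states[iota], new_weights, iota

rng = np.random.default_rng(0)
states = np.array([0.0, 1.0, 2.0, 3.0])
weights = np.array([0.1, 0.2, 0.3, 0.4])
# Original particle filter: selection weights equal to the particle weights.
new_states, new_weights, iota = resample(states, weights, weights, rng)
```

With other selection weights, for instance ones that look ahead to the next observation as in the auxiliary particle filter, the corrected weights are no longer all equal, but the weighted sample still targets the same distribution.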

#### Particle filter algorithm

The particle filter computes the state realizations and weights recursively through time. In its standard form, the particle filter can be stated as in algorithm 1.
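For orientation, the recursion can be sketched as a bootstrap-type filter for a scalar state. This is a simplified illustration under assumptions and not a transcription of Algorithm 1: the proposal is the Markov kernel itself, resampling is multinomial with the particle weights as selection weights, and the model functions `propagate` and `likelihood` as well as all numerical values are hypothetical.

```python
import numpy as np

def bootstrap_particle_filter(y, t_obs, x0_sampler, propagate, likelihood,
                              n_particles, rng, t0=0.0):
    """Return the filter means at the (known) observation times t_obs."""
    x = x0_sampler(n_particles, rng)
    w = np.ones(n_particles)
    t_prev = t0
    means = []
    for y_k, t_k in zip(y, t_obs):
        x = propagate(x, t_prev, t_k, rng)       # sample from the Markov kernel
        w = w * likelihood(y_k, x, t_k)          # update with measurement density
        means.append(np.sum(w * x) / np.sum(w))  # self-normalized filter mean
        ess = np.sum(w) ** 2 / np.sum(w ** 2)
        if ess < n_particles / 2:                # resample when ESS < N/2
            idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
            x, w = x[idx], np.ones(n_particles)
        t_prev = t_k
    return np.array(means)

# Hypothetical model: exponential decay with small additive process noise.
def propagate(x, s, t, rng):
    return x * np.exp(-(t - s)) + 0.02 * np.sqrt(t - s) * rng.standard_normal(x.shape)

def likelihood(y_k, x, t_k, sd=0.05):
    # Gaussian measurement density up to a constant factor.
    return np.exp(-0.5 * ((y_k - x) / sd) ** 2)

rng = np.random.default_rng(0)
t_obs = np.array([0.5, 1.0, 1.5, 2.0])
y = np.exp(-t_obs)  # noise-free synthetic data for the illustration
means = bootstrap_particle_filter(
    y, t_obs, lambda n, r: r.normal(1.0, 0.1, n), propagate, likelihood, 2000, rng)
```

Note that the weight update happens only at the observation times, which is exactly the dependence on known measurement times that the MTU variant removes.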

Note that if one chooses

#### Data Likelihood

*Algorithm 1 Standard particle filter*

Model validation or discrimination is generally based on the data likelihood for given observations *Y*_{1:k}. Without resampling, the data likelihood could be approximated by the empirical mean of the unnormalized weights, i.e. by

$$\frac{1}{N}\sum_{i=1}^{N} w_{t_k}^{(i)},$$

where $w_{t_k}^{(i)}$ denotes the unnormalized weight of particle *i* at time *t*_{k},

because this is the empirical estimate for the above expectation. After a resampling
step, this is not valid any longer. Nevertheless, in any case, the data likelihood
can be computed recursively by the following estimate of the ratio

with initial estimate

#### MTU particle filter: Uncertain measurement times

We now assume that each observation time *T*_{j} is a realization of a random variable *T*_{j}. Its distribution is expressed via a density *γ*_{j} with respect to the Lebesgue measure on [*t*_{0},*∞*). The observation times *T*_{j} themselves are not observed.

#### Full model

The full model in this case will include complete continuous state paths, since the
observation times are now distributed over the complete time axis [*t*_{0},*∞*), and the observations may potentially depend on every state
*t*_{j}∈[*t*_{0},*∞*). Consider therefore the joint density of the variables
*Y*_{1:M} and *t*_{1:M}, with respect to the product measure

#### Filter model

The filter at a given time *t*≥*t*_{0} is again based on a reduced model. This model is given by the joint density of the
following variables:
*t*; further only those variables *y*_{j} for which *T*_{j}≤*t*; and finally *t*_{1:M}. This density is given with respect to the product measure

Note that we cannot use the simple notation of the standard case, where only the first *k* observations are taken into consideration for filtering at time *t*_{k}, since the observations are not ordered in time and the times *T*_{j} are not fixed in advance. For this reason we have to include all measurements *Y*_{1:M} in the filter model as well. Note that even though we use the complete data *Y*_{1:M}=*y*_{1:M} in the notation, only those *y*_{j} for which *T*_{j}≤*t* holds have to be known at time *t*. To avoid confusion, we mark all densities connected to the filter model at time *t* by a hat superscript (and by the index *t*).

We will now derive formulas for the filter density. Since we assume that the observation
times *t*_{1:M} are not observed, we use marginalization to get the joint density for
*Y*_{1:M} only, which is

with respect to the product measure

then

and further

where the last step is possible because the factor indexed by *j* does not depend on
*j*^{′}≠*j*. For each *j*, we can split the integration by *T*_{j} at the time point *t* into two parts and get

where the last step follows from the fact that *γ*_{j} is a probability density and therefore

holds. Inserting this into (24), we get

With a further marginalization, we get the joint density of *X*_{t} and *Y*_{1:M} for the filter model,

which is with respect to the product measure

where

is the data likelihood with respect to the measure

#### Effective computation of the filter distributions

In the following paragraph, we will show how the densities of the filter distributions given by (26) can be effectively computed. This is the basis for the formulation of our MTU particle filter method.

Let the observations
*t*∈ [*t*_{0},*∞*) and for each *j*∈{1,…,*M*}, we define random variables

by the following system of ODEs

for each *ω*∈*Ω* with initial values

and by

We will show that for each set *A* in the *σ*-algebra generated by the variable *X*_{t}, it holds that

where
*W*_{j,t} and *W*_{t} to compute the filter distributions through time. From this, it follows immediately
that we can also compute filter expectations. Indeed, for any real-valued measurable
function *h* on
*h*(*X*_{t})|]<*∞*, it holds that the expectation of *h*(*X*_{t}) given *Y*_{1:M}=*y*_{1:M} with respect to the filtered state *X*_{t} defined by

is given by the following equation:

To show our assertion, we consider the processes *W*_{j,t} for *j*=1,…,*M*. According to (28) and (29), each *W*_{j,t} is defined as

so, with (30),

holds. Thus, to prove (31), we have to show that for each set *A* from the *σ*-algebra generated by *X*_{t},

holds (see (25) and (26)). It is enough to show the equality for the numerator, i.e.

since the equality of the denominator follows then immediately from the special case
*A*=*Ω* and from the fact that

Using the variable transformation

This is what we wanted to show.

#### Weights

Since for each *t* and for each *j*=1,…,*M* the random variables *W*_{j,t} and *W*_{t} depend only on the process
*t*, we can define functions

It follows from (34) that

for each *j* and from (35) that

The values of *W*_{t} will serve as weights in the MTU particle filter. We will call *W*_{j,t} the partial weights. Since in each discretization scheme which is applied to solve
the integral in the formula (36) for *W*_{j,t} the integrand has to be evaluated, we may run into practical problems if we use it
as it is written in the formula. If the density

where the cumulative distribution function

is independent of

depends on the path
*γ*_{j}, if it is computationally available.
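The computation of a partial weight along one discretized state path can be sketched numerically. The sketch rests on an assumption about the form of the partial weight, namely $W_{j,t} = \int_{t_0}^{t} g_j(y_j \mid X_s, s)\,\gamma_j(s)\,\mathrm{d}s + 1 - \Gamma_j(t)$ with $\Gamma_j$ the cumulative distribution function of *γ*_{j}; this is our reading of the split of the integration at time *t* described earlier, and the function name `partial_weight`, the trapezoidal rule, and the test densities are likewise hypothetical. The integral is approximated numerically, while the remaining probability mass is taken from the exact cumulative distribution function.

```python
import numpy as np

def partial_weight(times, g_values, gamma_pdf, gamma_cdf):
    """Partial weight W_{j,t} on a time grid for one state path.

    times    : increasing grid with times[0] = t_0
    g_values : measurement density g_j(y_j | x_s, s) evaluated on the grid
    Returns an array with W_{j,t} for every grid time t.
    """
    integrand = g_values * gamma_pdf(times)
    # Cumulative trapezoidal rule for the integral of g * gamma from t_0 to t.
    increments = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(times)
    integral = np.concatenate([[0.0], np.cumsum(increments)])
    # Probability that the measurement time lies after t, from the exact CDF.
    return integral + 1.0 - gamma_cdf(times)

# Hypothetical check: measurement time uniform on [1, 2], constant density g = 3.
times = np.linspace(0.0, 3.0, 3001)
uniform_pdf = lambda t: ((t >= 1.0) & (t <= 2.0)).astype(float)
uniform_cdf = lambda t: np.clip(t - 1.0, 0.0, 1.0)
w_j = partial_weight(times, np.full_like(times, 3.0), uniform_pdf, uniform_cdf)
```

Before the support of *γ*_{j} is reached, the partial weight stays at 1, so the observation has no influence yet. Once the support has been passed, the weight equals the measurement density averaged over the possible measurement times. Under the independence assumptions on the *T*_{j}, the full weight of a particle would then be the product of its partial weights, which is again an assumption of this sketch.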

Note that the definition of the filter distribution is dependent on the reference
measure

#### Resampling

Special attention is needed for the computation of the weights after resampling steps
have been applied. As mentioned earlier, resampling at time *s*_{ℓ} is done by randomly generating a selection function *ι*_{ℓ}:*I*→*I* (with index set *I*={1,…,*N*}) based on given non-negative (unnormalized) selection weights
*i*. The state samples
*s*_{1},…,*s*_{ℓ} with *t*_{0}≤*s*_{1}⋯<*s*_{ℓ}≤*t* have occurred, states and weights have been replaced at these times, and within this
paragraph, we denote them by
*i*. By definition, we have at *t*=*s*_{ℓ}

and

For each time *t*≥*s*_{ℓ} and for each particle *i*, the corrected weights are then recursively given by

with

Since the process *W*_{t} computes the uncorrected weights