Dept. of Clinical Neurosciences Cambridge Centre for Brain Repair University of Cambridge, UK

INRIA Rennes - Bretagne Atlantique Campus Universitaire de Beaulieu 35042 Rennes Cedex, France

The Microsoft Research - University of Trento Centre for Computational and Systems Biology Trento, Italy

Abstract

Background

Constructing predictive dynamic models of interacting signalling networks remains one of the great challenges facing systems biology. While detailed dynamical data exists about individual pathways, the task of combining such data without further lengthy experimentation is highly nontrivial. The communicating links between pathways, implicitly assumed to be unimportant and thus excluded, are precisely what become important in the larger system and must be reinstated. To maintain the delicate phase relationships between signals, signalling networks demand accurate dynamical parameters, but parameters optimised in isolation and under varying conditions are unlikely to remain optimal when combined. The computational burden of estimating parameters increases exponentially with increasing system size, so it is crucial to find precise and efficient ways of measuring the behaviour of systems, in order to re-use existing work.

Results

Motivated by the above, we present a new frequency domain-based systematic analysis technique that attempts to address the challenge of network assembly by defining a rigorous means to quantify the behaviour of stochastic systems. As our focus we construct a novel coupled oscillatory model of p53, NF-κB and the mammalian cell cycle, based on recent experimentally verified mathematical models. Informed by online databases of protein networks and interactions, we distilled their key elements into simplified models containing the most significant parts. Having coupled these systems, we constructed stochastic models for use in our frequency domain analysis. We used our new technique to investigate the crosstalk between the components of our model and measure the efficacy of certain network-based heuristic measures.

Conclusions

We find that the interactions between the networks we study are highly complex and not intuitive: (i) points of maximum perturbation do not necessarily correspond to points of maximum proximity to influence; (ii) increased coupling strength does not necessarily increase perturbation; (iii) different perturbations do not necessarily sum and (iv) overall, susceptibility to perturbation is amplitude and frequency dependent and cannot easily be predicted by heuristic measures.

Our methodology is particularly relevant for oscillatory systems, though not limited to these, and is most revealing when applied to the results of stochastic simulation. The technique is able to characterise precisely the distance in behaviour between different models, different systems and different parts within the same system. It can also measure the difference between different simulation algorithms used on the same system and can be used to inform the choice of dynamic parameters. By measuring crosstalk between subsystems it can also indicate mechanisms by which such systems may be controlled in experiments and therapeutics. We have thus found our technique of frequency domain analysis to be a valuable benchmark systems-biological tool.

Background

Introduction

Many problems related to systems biology remain computationally hard (their difficulty increases exponentially with instance size), meaning that a brute force computational approach will only be tractable for small instance sizes. Despite apparently ever-increasing available computational power, in order to take full advantage of computational methods it is still necessary to apply them judiciously. This means balancing the requirements of precision and accuracy and finding meaningful abstractions which optimise them.

Representing signalling networks as dynamical systems of interacting populations of molecules offers the tantalising prospect of being able to predict the future behaviour of such networks by simulation. Precision in the

To take advantage of the vast repository of accumulated data and the easy availability of computational power, we have devised an efficient systematic approach that allows automatic analysis and verification of large dynamical models in a meaningful way. Noting that oscillatory behaviour is ubiquitous in biological systems, we present a new automated analysis technique based on frequency domain analysis, able to measure precisely the behaviour (oscillatory or otherwise) of interacting systems. To demonstrate the utility of this approach we apply it to a novel coupled oscillatory model of p53, NF-κB and the mammalian cell cycle. In what follows we first describe the background to the modelling process and explain our methodology in detail; we then present and discuss our results and finally draw conclusions. An additional file contains further background to the modelling and analysis process, plus detailed descriptions of the models we have created.

Biological context

It is well-known that signalling pathways that govern cellular death are of critical importance for normal tissue development, homeostasis and function

Oscillations are necessarily ubiquitous in biology and are found, for example, in the pulse of the heart, the circadian rhythm, in the signal transduction that involves adenosine 3',5'-cyclic monophosphate (cAMP) and in the chemotaxis of

In the literature, qualitative descriptions of the components and mechanisms of oscillatory signalling systems have greatly improved our understanding of how cells function and have given insights into their behavioural properties, along with how to intervene therapeutically when such signals are mis-communicated

Technical motivation

Full mathematical analysis of interesting biological systems is usually impractical; the simplifications that are effective for small systems are generally not scalable. Moreover, low dimensional explanations of highly complex behaviour seem to defeat the purpose of constructing large models. For large systems we require a

The model

To demonstrate the ideas and power of the proposed method, we apply it to theoretical models of transcription factors identified to play critical roles in cell differentiation and cell death. Aberrant NF-κB (p50/p105, p52/p100, RelA, c-Rel, RelB), best known for its role in immune and inflammatory responses, is an active growth- and division-promoting transcription factor

We have extended the chosen models to include their involvement with the cell cycle. For example, an immune response to a foreign organism results in the promotion of the target gene cyclin D1; and a response to a high mutation or error rate brought about by DNA damage results in the transcriptional upregulation of target gene p21 via p53 to initiate cell cycle arrest. Cyclin D1 promotes cell cycle progression through G1-phase by forming active holoenzymes with CDK (cyclin-dependent kinase) 4 and CDK6. CDK4 and 6 phosphorylate the Rb (retinoblastoma protein)

Methods

We are principally interested in the interactions of the processes generating oscillation, so our approach is to find simple models which nevertheless capture the fundamental characteristics of their oscillatory behaviour at a mechanistic level. We considered published mathematical models of the IκB-NF-κB

Model creation

Models (networks) taken from the literature and databases often contain elements not crucial to the observed behaviour but included as the valid results of research and experiments. With judicious pruning (see e.g.

For their involvement with the cell cycle, the two pathways were connected via components whose regulation is activated by one pathway but coupled to substrates belonging to the G1/S phase of the cell cycle network. Such components are the promoter activity of cyclin D1 molecules (a protein required for cell cycle progression from the G1 phase to S phase) that have been shown to be activated by NF-κB transcription factor

Stochastic modelling

In designing the linked systems, both deterministic and stochastic methods were utilized. Up-to-date models were taken from the literature in the form of ordinary and delay differential equations. Links were hypothesised based on a literature search and the models were simplified and parameterised using the assumptions outlined above and in Additional file ^{-1 }that was also used to transform the rate constants (see Additional file

**Supplementary material**. The supplementary material contains supplementary results and methods, including details of the mathematical models employed and other examples of the application of frequency domain analysis.

Stochastic simulation

Simulation is a very simple means to get an idea of the behaviour of a dynamical system. In a deterministic framework the evolution of concentration in time produced by numerically solving a set of ODEs is a direct characterisation of its average behaviour, but individual stochastic simulation traces may be quite different from one another. There is often an intuitive notion of average behaviour, apparently related to the solution of the corresponding ODE, but this is merely coincidental. Since such an ODE defines the behaviour of the stochastic system taken to the

The stochastic models we consider here are governed by the chemical master equation (CME, see e.g.
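As an illustration of exact stochastic simulation of a CME-governed system, the following is a minimal sketch of the Gillespie direct method applied to a hypothetical birth-death process (production at constant rate, first-order degradation). The reaction system, the rate constants and the function names `gillespie_birth_death` and `sample_uniform` are assumptions made for this example and are not taken from the models studied here. The second function resamples the event-driven trace onto a uniform time grid, as is needed before a discrete Fourier transform can be applied.

```python
import numpy as np

def gillespie_birth_death(k_prod, k_deg, x0, t_end, rng):
    """Direct-method simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x)."""
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod            # propensity of production
        a_deg = k_deg * x          # propensity of degradation
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                  # no reaction can fire
        t += rng.exponential(1.0 / a_total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return np.array(times), np.array(counts)

def sample_uniform(times, counts, dt, t_end):
    """Resample a piecewise-constant trace onto a uniform grid with spacing dt."""
    grid = np.arange(0.0, t_end, dt)
    # Index of the last reaction event at or before each grid point.
    idx = np.searchsorted(times, grid, side="right") - 1
    return grid, counts[idx]

rng = np.random.default_rng(0)
times, counts = gillespie_birth_death(k_prod=10.0, k_deg=0.1, x0=0, t_end=50.0, rng=rng)
grid, sampled = sample_uniform(times, counts, dt=0.5, t_end=50.0)
```

Different seeds give different traces from identical initial conditions, which is why a single run cannot be taken as representative of the system.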

Thus, while the choice of a discrete stochastic framework offers the potential to investigate chemically reacting biological systems in the most precise way, in order to draw general conclusions about a model's behaviour from stochastic simulations it is necessary to characterise some kind of average trajectory that preserves the behaviour. Averaging the time series of multiple stochastic simulation runs, however, does not produce an average trajectory: the amount of a molecular species at a given time point in different simulation runs is a random variable whose distribution is defined by the CME. The consequence is that the averaged oscillatory behaviour of stochastic time series tends to disappear with increasing time, because as time progresses the system is less likely to be in a unique state. This is illustrated in Additional file
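The loss of oscillation under time-domain averaging can be reproduced with a toy ensemble. The sketch below does not use the models of this paper: it assumes unit-amplitude sinusoids whose phase performs a random walk, a simple stand-in for the phase drift of stochastic oscillators. The period, the `phase_noise` value and the number of runs are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_runs, n_steps, dt = 200, 5000, 0.1
period = 10.0
phase_noise = 0.02  # variance of the phase drift per unit time (hypothetical)

t = np.arange(n_steps) * dt
# Each run accumulates its own random phase drift.
steps = rng.normal(0.0, np.sqrt(phase_noise * dt), size=(n_runs, n_steps))
phase = np.cumsum(steps, axis=1)
traces = np.sin(2.0 * np.pi * t / period + phase)

# Point-wise average over runs, as in naive time-series averaging.
mean_trace = traces.mean(axis=0)
early_amplitude = np.abs(mean_trace[:100]).max()   # first period
late_amplitude = np.abs(mean_trace[-100:]).max()   # last period
per_run_amplitude = np.abs(traces).max(axis=1).min()
```

Every individual trace keeps oscillating at close to unit amplitude, yet the averaged trace decays as the runs lose phase coherence, so the late-time average gives a misleading picture of the behaviour.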

Statistical measures over frequency spectra

We make multiple simulation runs (100 for the presented results), having identical initial conditions and length of simulated time, and the resulting time series are converted to complex frequency spectra using the discrete Fourier transformation (DFT):

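$$f_\omega = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i \omega n / N} \qquad (1)$$

where $f_\omega$ is the $\omega^{th}$ frequency component (of a total of $N$) and $x_n$ is the $n^{th}$ (of $N$) time sample of the simulation trace, with sampling interval $\delta_t$. $N$ and $\delta_t$ determine the minimum observable frequency ($(N\delta_t)^{-1}$), and the maximum observable frequency ($(2\delta_t)^{-1}$). To maximise the range and the precision of the analysis it is generally desirable to have large

The result of the DFT is a complex frequency spectrum, from which the amplitude spectrum is obtained:

$$a_\omega = |f_\omega| \qquad (2)$$

where $a_\omega$ is the $\omega^{th}$ component of the amplitude spectrum and $f_\omega$ is the $\omega^{th}$ component of the complex spectrum resulting from Equation (1). The average amplitude spectrum is then defined:

$$\bar{a}_\omega = \frac{1}{M} \sum_{m=1}^{M} a_\omega^{(m)} \qquad (3)$$

where $\bar{a}_\omega$ is the $\omega^{th}$ component of the average amplitude spectrum, $a_\omega^{(m)}$ is the $\omega^{th}$ component of the amplitude spectrum from the $m^{th}$ simulation run and $M$ is the number of simulation runs. By thus discarding the average phase information (noting that amplitude and phase are not independent in models of this kind and that phase information encapsulating the causality of individual traces is thus contained in the individual amplitude spectra), it is possible to reveal the average oscillatory behaviour in an intuitive way. We have found the average phase information to be less informative (highly stochastic, with no apparent coherence), although it can be examined independently, if required.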

The spectra created in this way form distributions which tend to characterise the observed behaviour in a compact, informative form. Although the frequency spectra contain as many points as a single simulation run and may also contain noise, the processes of transformation and averaging serve to resolve and elucidate the characteristic behaviour. Moreover, we are then able to measure and compare the spectra so produced. In particular, we use a discrete space version of the Kolmogorov-Smirnov (K-S) statistic

The following procedure is used to generate average frequency spectra to characterise a set of simulations for the purpose of visual comparison or analysis of stochasticity.

Procedure A:

1. Perform a number of simulation runs which are long enough to demonstrate a phenomenon of interest.

2. Generate average frequency amplitude spectra for each molecular species:

a. Sample each simulation trace according to

b. Calculate term-wise means of the amplitude spectra according to Equation (3).

3. Iterate 1 and 2, adding new simulations to the average as necessary (e.g., until the average spectra are sufficiently free of noise).
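Procedure A can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation used for the results: it assumes the traces have already been sampled on a uniform grid, uses `numpy.fft.rfft` (which returns only the non-negative frequency components of a real signal), and is exercised on synthetic noisy sinusoids with random phases instead of simulation output. The function name `average_amplitude_spectrum` is hypothetical.

```python
import numpy as np

def average_amplitude_spectrum(traces, dt):
    """Term-wise mean of the amplitude spectra of uniformly sampled traces.

    traces: array of shape (n_runs, n_samples); dt: sampling interval.
    Returns (frequencies, mean amplitude spectrum).
    """
    traces = np.asarray(traces, dtype=float)
    # DFT of each run, then discard phase by taking the modulus.
    spectra = np.abs(np.fft.rfft(traces, axis=1))
    freqs = np.fft.rfftfreq(traces.shape[1], d=dt)
    # Average the amplitude spectra over runs, component by component.
    return freqs, spectra.mean(axis=0)

# Synthetic stand-in for simulation output: same frequency, random phases.
rng = np.random.default_rng(2)
n_runs, n_samples, dt = 100, 1000, 0.1
t = np.arange(n_samples) * dt
phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_runs, 1))
noise = rng.normal(0.0, 0.1, size=(n_runs, n_samples))
traces = np.sin(2.0 * np.pi * 0.1 * t + phases) + noise

freqs, mean_spectrum = average_amplitude_spectrum(traces, dt)
peak_frequency = freqs[np.argmax(mean_spectrum)]
```

Because the phases are random, the point-wise mean of these traces is close to zero, whereas the average amplitude spectrum retains a clear peak at the common oscillation frequency. The first non-zero and the last entries of `freqs` correspond to the minimum and maximum observable frequencies set by the number of samples and the sampling interval.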

The following procedure is used to measure the difference between alternative systems or alternative simulation algorithms.

Procedure B:

1. Perform a number of pairs of simulation runs, where

a. each pair comprises the two alternative systems/algorithms and

b. the number of runs is designed to take an acceptable amount of time.

2. Generate average frequency amplitude spectra for each molecular species of the alternative systems/algorithms, as per Procedure A 2a and 2b.

3. For each molecular species of interest, calculate

4. Iterate 1-3, adding new simulations to calculate

The number of simulation runs required (
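The comparison step of Procedure B can be illustrated with a discrete Kolmogorov-Smirnov style distance between two amplitude spectra. The sketch below rests on an assumption about the form of the statistic: each spectrum is normalised to sum to one, treated as a discrete distribution over frequency, and the distance is the largest absolute difference between the two cumulative distributions. The function name `ks_distance` is hypothetical and the exact definition used for the results may differ.

```python
import numpy as np

def ks_distance(spectrum_a, spectrum_b):
    """Maximum absolute difference between the cumulative normalised spectra."""
    a = np.asarray(spectrum_a, dtype=float)
    b = np.asarray(spectrum_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("spectra must have the same number of components")
    # Normalise so that each spectrum sums to one, then accumulate.
    cdf_a = np.cumsum(a / a.sum())
    cdf_b = np.cumsum(b / b.sum())
    return float(np.max(np.abs(cdf_a - cdf_b)))

same = ks_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])            # identical spectra
shifted = ks_distance([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0])  # partly shifted
disjoint = ks_distance([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0])  # no overlap
```

Under this definition the distance lies between 0 (identical normalised spectra) and 1 (all amplitude at opposite ends of the frequency range), which gives a single number per molecular species for comparing alternative systems or algorithms.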

Efficiency

Our analysis methodology scales efficiently with respect to model size (number of different molecular species), especially in comparison to numerical techniques for finding the probability distribution of states in Markov chains (the mathematical structure underlying our stochastic models)

Results and discussion

Our crosstalk experiment considers the vector of change comprising the changes in behaviour of molecular species in the cell cycle resulting from connection to the NF-κB and p53 systems, relative to their behaviour when the external systems are not connected. Precise details of the models we constructed are given in Additional file

**Coupled model of cell cycle G1/S phase, p53 and NF-κB**. Diagram showing the complete model described in the text, illustrating how molecular species influence those to which they are connected by reactions. A complete mathematical description of the model is given in Additional file

We applied Procedure B (Methods) with pairs of

**Perturbation of cell cycle components by p53a and NF-κBn**. Diagrams illustrate the quantitative influence of external oscillatory networks (not depicted) on cell cycle components (the nodes). White nodes are most perturbed, black nodes least (values in Additional file). **A** Perturbation by p53a. **B** Perturbation by NF-κBn. **C** Perturbation by simultaneous influence of p53a and NF-κBn.

Previous work

**Time and frequency domain representations of the behaviour of NF-κBn, p53a, p21 and CycE-CDK2-p21**. Individual time courses (left) and average frequency spectra (right). **A** Left panel: time courses of stochastically simulated NF-κBn (red) and p53a (blue). Quasi-deterministic time courses superimposed in black. Right panel: average frequency spectra of NF-κBn (red), p53a (blue) and E2F-Rbpp perturbed by NF-κBn alone (black). **B** Evidence of crosstalk in time (left) and frequency domain (right) of p21 in the fully coupled network (red), in comparison to the isolated cell cycle (black). **C** Stochasticity in time (left) and frequency domain (right) of CycE-CDK2-p21 in isolated cell cycle, using quasi-deterministic (black) and fully stochastic models (red).

It is immediately apparent from our results that the nature of crosstalk is at times counter-intuitive in terms of causality. For example, the species directly influenced by NF-κBn is only weakly perturbed while the point of maximum perturbation is three steps away from NF-κBn. Such phenomena are perhaps to be expected in coupled non-linear dynamical systems. Nevertheless, we wished to investigate whether there is in fact a simpler explanation of crosstalk, based on network topology, that can be inferred without simulation. In Figures ^{2}) value of 0 indicates that the minimum distance has no predictive power in this case (R^{2 }= 1 being perfect). By including the influence of all possible paths between NF-κBn and cell cycle species the predictive power of the model improves (red). In the case of influence by p53a (Figure ^{2 }= 0.145 vs. R^{2 }= 0.4). In the fully coupled model (Figure

**Evaluation of network-based heuristics by frequency domain analysis**. Correlation of minimum distance (black) and weighted network distance (red) with measured perturbation of the cell cycle. **A** Perturbation by p53a. **B** Simultaneous perturbation by p53a and NF-κBn. **C** Perturbation by NF-κBn. R^{2} is the coefficient of determination and indicates the ability of the heuristic to predict the measurement: R^{2} = 1 is perfect; R^{2} = 0 shows no ability.

Thus the prediction afforded by the minimum distance may at times
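The evaluation of the minimum distance heuristic can be sketched as follows. The network, node names and perturbation values below are entirely hypothetical and serve only to show the computation: minimum distance is found by breadth-first search over a directed graph, and the coefficient of determination of a straight-line fit is taken as the squared Pearson correlation. The weighted network distance is not covered by this sketch.

```python
from collections import deque
import numpy as np

def minimum_distances(graph, source):
    """Breadth-first search: fewest reaction steps from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def r_squared(x, y):
    """Coefficient of determination of a least-squares straight-line fit of y on x."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)

# Hypothetical toy network: "src" stands for the external oscillator.
graph = {"src": ["a"], "a": ["b", "c"], "b": ["d"], "c": [], "d": []}
dist = minimum_distances(graph, "src")

# Hypothetical perturbation measurements for the four downstream species.
perturbation = {"a": 0.1, "b": 0.4, "c": 0.2, "d": 0.3}
species = sorted(perturbation)
score = r_squared([dist[s] for s in species], [perturbation[s] for s in species])
```

A score near 1 would mean that distance from the source predicts the measured perturbation well, while a score near 0 would mean that it has no predictive power for that system.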

Conclusions

A key challenge of systems biology is to assemble the disparate information gathered over years of experimentation and research into a coherent whole. To avoid the intractable computational cost of re-parameterising existing models, heuristic techniques, such as those of network analysis, may be employed to simplify the task. To evaluate the performance of these heuristics and verify what is created, efficient, meaningful, high resolution analytic techniques must be developed. This document presents one such: a systematic technique for characterising behaviour and for measuring the interactions and connections between and within signal transduction pathways using frequency domain analysis. We have constructed a novel dynamical model of communicating oscillatory networks of p53, NF-κB and the G1/S phase of the cell cycle and have applied our technique to investigate it. In doing so, our investigation has revealed complex counter-intuitive dependencies and has demonstrated that the methodology is reliable, precise and capable of distinguishing the effects of multiple interactions.

As general conclusions for the model we have found that (i) p21 and CycA-CDK2-p21 are the species most strongly influenced by the p53 network and that the perturbation is primarily at the principal oscillatory frequency of p53a and local to the perturbation; (ii) p21 and CycA-CDK2-p21 are only weakly perturbed by the NF-κB network; (iii) E2F-Rbpp is the species most strongly perturbed by the NF-κB network and the perturbation is indirect and from the low frequency transient of NF-κBn, rather than its higher frequency oscillations; (iv) increased coupling strength tends to reinforce trends in crosstalk; however (v) E2F-Rbpp is moderately perturbed by p53a with single coupling strength,

Quantifying in detail the extent to which molecular species are robust or sensitive to perturbations potentially indicates the mechanisms by which the system may be manipulated in experiments and therapeutics. Strictly, the dependencies we have discovered are features of the models we have used, the simulation algorithm we have chosen and the links we have hypothesised (the standard modeller's proviso). There are clearly many additional interconnections with other pathways that we (and others) have not yet modelled (the published models of the systems we consider here are continually being refined

We have described how our methodology is efficient relative to the standard numerical techniques used to investigate Markov chains and have observed that, in addition, such techniques are cumbersome in describing behaviour in comparison to ours. To add weight to these claims and as a further demonstration of the utility of our benchmark technique, we have shown the results of investigating two network-based heuristics, finding that they are not adequate in describing the complex frequency-dependent interplay in our model and may give misleading results. It is important to note here that our methodology is a precise means of measuring and comparing simulation time series and that it has no obvious inherent prejudice with respect to the type of model or means of simulation. There are practical considerations, relating to the efficacy and precision of numerical algorithms, which make certain combinations of model and simulation algorithm infeasible, but these considerations are independent of our methodology. In our investigation of the cell cycle - p53 - NF-κB system, we have used an exact stochastic simulation algorithm, but have chosen to investigate both a model which is, as far as possible, reduced to elemental reactions (thus modelling the supposed real physical process) and one which is essentially a stochastic interpretation of the differential equations (perhaps only weakly related to physics). While the qualitative differences between these two cases are clear, our methodology is able to provide a

Our focus has been stochastic models, but there are well-established techniques used to investigate the dynamics of deterministic systems that can be seen as potential alternatives to our methods (ignoring their fundamental limitation of not considering variance).

Given the vast repository of individual models in the literature and in online databases that await combination and validation, we have shown that our methods have great potential for application in systems biology. We also envisage further improvements and refinements to our techniques. Biological systems often contain processes that operate at scales of time and size differing by orders of magnitude. Although transformation into the frequency domain has here proved to be both effective and intuitive, in order to integrate and analyse large

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AECI created the models and co-wrote the manuscript. SAS performed the simulations and analysis and co-wrote the manuscript. The authors read and approved the final manuscript.

Acknowledgements

This work has been partially funded by FIRB Project RBPR0523C3 (AECI) and by Fondazione CARIPLO and Fondazione CARITRO under the NOBEL Project (SAS). The authors wish to thank colleagues at COSBI, Ivan Mura, Attila Csikasz-Nagy and Matteo Cavaliere, as well as external colleagues Neil Perkins and Stefano Pluchino, for valuable discussions.