Division of Computational Biology, Program in Chemical Safety Sciences, The Hamner Institutes for Health Sciences, Research Triangle Park, NC 27709, USA

Abstract

Background

The image of the "epigenetic landscape", with a series of branching valleys and ridges depicting stable cellular states and the barriers between those states, has been a popular visual metaphor for cell lineage specification - especially in light of the recent discovery that terminally differentiated adult cells can be reprogrammed into pluripotent stem cells or into alternative cell lineages. However, the question of whether the epigenetic landscape can be mapped out quantitatively to provide a predictive model of cellular differentiation remains largely unanswered.

Results

Here we derive a simple deterministic path-integral quasi-potential, based on the kinetic parameters of a gene network regulating cell fate, and show that this quantity is minimized along a temporal trajectory in the state space of the gene network, thus providing a marker of directionality for cell differentiation processes. We then use the derived quasi-potential as a measure of "elevation" to quantitatively map the epigenetic landscape, on which trajectories flow "downhill" from any location. Stochastic simulations confirm that the elevation of this computed landscape correlates inversely with the likelihood of occurrence of particular cell fates, with well-populated low-lying "valleys" representing stable cellular states and higher "ridges" acting as barriers to transitions between the stable states.

Conclusions

This quantitative map of the epigenetic landscape underlying cell fate choice provides mechanistic insights into the "forces" that direct cellular differentiation in the context of physiological development, as well as during artificially induced cell lineage reprogramming. Our generalized approach to mapping the landscape is applicable to non-gradient gene regulatory systems for which an analytical potential function cannot be derived, and also to high-dimensional gene networks. Rigorous quantification of the gene regulatory circuits that govern cell lineage choice and subsequent mapping of the epigenetic landscape can potentially help identify optimal routes of cell fate reprogramming.

Background

The biologist Conrad Hal Waddington, in the course of a career spanning four decades (1930s - 1970s), attempted a bold synthesis of the fields of genetics, embryology and evolution


**Mapping Waddington's epigenetic landscape**. **(A)** The "epigenetic landscape" proposed by Conrad Waddington shows a ball rolling down valleys separated by ridges on an inclined surface, as a visual metaphor for the branching pathways of cell fate determination. Figure reproduced from original text by Waddington. **(B)** The computed epigenetic landscape for a two-gene (

In the quantitative view of a cell as a dynamical system governed by genetic interaction networks

Huang, Wang and colleagues have recently proposed a probabilistic "pseudo-potential" to quantify the epigenetic landscape for a gene network regulating cell fate, where the elevation of the surface is inversely related to the likelihood of occurrence of a particular state in phase space

Here we propose a simple numerical method to map the epigenetic landscape that is not based on a probabilistic or master-equation approach. Instead, a quasi-potential surface (Figure

Finally, we discuss ways in which this quantitatively mapped landscape may help predict the efficiency of cellular de-differentiation or trans-differentiation, and identify optimal routes of cell fate reprogramming. Recent discoveries have challenged the dogma of cell fate determination as a unidirectional and irreversible process. Even terminally differentiated adult cells have now been shown to retain considerable phenotypic plasticity and the ability to be reprogrammed into pluripotent stem cell-like states

Results and Discussion

Derivation of the quasi-potential landscape

We first illustrate our quantitative approach with a simple circuit of two genes (see **Methods**). This circuit works as a toggle switch with two stable steady states: one state with high

If we were able to derive a closed-form potential function

then the local minima on the two-variable potential surface

In general, condition (3) will not be valid for an arbitrary circuit of two genes

Therefore, given that a gene circuit is in general a _{q} that changes incrementally along a trajectory followed by the system in


**Computing the epigenetic landscape for a bistable switch based on a double-negative feedback circuit of two genes x and y**.

where Δ_{q}, to emphasize its distinction from a closed-form potential function.

The change in the quasi-potential, Δ_{q}, can be rewritten from Eq. 4 as:

For positive increments in time Δ_{q} is thus always negative along an evolving trajectory, ensuring that trajectories flow "downhill" along a putative "quasi-potential surface". Stable steady states of the system (_{q} = 0 (per Eq. 5). The overall change in the quasi-potential along a trajectory can then be calculated by numerically integrating the quantity Δ_{q} in Eq. 4 from a given initial configuration up to a stable steady state, thereby allowing us to map out a temporal trajectory along the putative quasi-potential surface (Figure
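As a worked illustration of the two relations referred to above as Eq. 4 and Eq. 5, one form consistent with the surrounding description is sketched here. The symbol $V_q$ for the quasi-potential and the variable names $x$, $y$, $t$ are chosen for illustration and should be read as assumptions, not as the authors' exact notation.

```latex
% Increment of the quasi-potential along a trajectory (form assumed for Eq. 4)
\Delta V_q = -\left( \frac{dx}{dt}\,\Delta x + \frac{dy}{dt}\,\Delta y \right)

% Substituting \Delta x = \frac{dx}{dt}\,\Delta t and \Delta y = \frac{dy}{dt}\,\Delta t
% gives a form that is non-positive for \Delta t > 0 (form assumed for Eq. 5)
\Delta V_q = -\left[ \left(\frac{dx}{dt}\right)^{2} + \left(\frac{dy}{dt}\right)^{2} \right] \Delta t \;\le\; 0
```

Written this way, the increment vanishes only where both rates are zero, which matches the statement that steady states correspond to a zero change in the quasi-potential.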

The procedure described above was repeated to evaluate the change in the quasi-potential along trajectories originating from different points in


**Ridges and valleys on the computed epigenetic landscape of a bistable (A, B) and a tristable (C, D) regulatory network of two genes x and y**. The alignment of trajectories produces the "ridges" on the epigenetic landscape (indicated by arrows in panels

The same procedure can be applied to systems with more than two stable steady states - for instance, a "tristable" system produced by a circuit of two genes that induce their own expression, in addition to mutual inhibition (Figure

Quantitative interpretation of the quasi-potential landscape

To establish that the "elevation" of the computed landscape at a given location in


**Valleys on the computed epigenetic landscape represent high-occupancy stable steady states, while ridges represent barriers to stochastic transitions between those stable states**. For the tristable two-gene system, increasing the Hill coefficient _{H}, which represents the degree of ultrasensitivity in autoregulation and mutual inhibition of the two genes (see **Methods**), makes the ridges (barriers) higher and steeper relative to the valleys (attractors). Higher ridges reduce the probability of stochastic switching among adjacent attractors. **(A)** _{H} = 2; **(B)** _{H} = 3; **(C)** _{H} = 4; **(D)** _{H} = 6. **Left Panels**: Colored circles represent a population of 1000 stochastically simulated "cells" residing in the three stable steady states **Middle Panels**: Projections of the epigenetic landscape onto the **Right Panels**: An alternative view of the epigenetic landscape. The vertical dashed red lines are guides to the eye to show that the relative distance between the steady states on the x-y phase plane does not change appreciably even as the Hill coefficient _{H} is increased from 2 to 6. The change in relative occupancy of the attractors can therefore be attributed to the increased height and steepness of the barriers separating them.


**Height and steepness of barriers affect stochastic occupancy of stable states**. **(A)** Percentage of stochastically simulated cells in the three attractors _{H}. All simulations were started from state **(B, C)** With increasing _{H}, the height and steepness of the ridges relative to the valleys are increased, making stochastic transitions (arrows) from state **(B)** _{H} = 2; **(C)** _{H} = 6.

The "third dimension" (elevation) of the landscape represented by the quasi-potential, although directly derived from the dynamic rate equations without any additional information, thus yields an interpretation of cellular stability not immediately apparent from two-dimensional phase portrait analysis. The analysis above supports the contention that the length of the "least action trajectory" along the contours of the epigenetic landscape is more important in predicting transitions between alternative cellular states than the simple "aerial distance" in state space
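The kind of stochastic occupancy experiment discussed above can be approximated with a small Gillespie-style simulation. The sketch below is written in Python, not with the BioNetS program used by the authors. It assumes Hill-type autoregulation and mutual repression terms, treats the dimensionless expression levels as integer molecule counts, starts every cell in the central state, and uses rough reference points for the three attractors. All of these choices, along with the names `propensities`, `simulate_cell`, and `occupancy`, are hypothetical and serve only as illustration.

```python
import random

# Approximate attractor locations, assumed here only to label simulated cells.
ATTRACTORS = {"high_x": (20, 0), "central": (10, 10), "high_y": (0, 20)}


def propensities(x, y, n_hill, f=10.0, k=4.0, deg=1.0):
    """Production and degradation propensities (assumed Hill-type form)."""
    kn = k ** n_hill
    xn, yn = x ** n_hill, y ** n_hill
    make_x = f * xn / (kn + xn) + f * kn / (kn + yn)
    make_y = f * yn / (kn + yn) + f * kn / (kn + xn)
    return make_x, deg * x, make_y, deg * y


def simulate_cell(x, y, n_hill, t_end, rng):
    """Gillespie direct method for one cell; returns the state at t_end."""
    t = 0.0
    while True:
        a = propensities(x, y, n_hill)
        total = sum(a)
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            return x, y
        r = rng.random() * total  # pick which reaction fires
        if r < a[0]:
            x += 1
        elif r < a[0] + a[1]:
            x -= 1
        elif r < a[0] + a[1] + a[2]:
            y += 1
        else:
            y = max(y - 1, 0)  # guard against floating-point edge cases


def occupancy(n_hill, n_cells=50, t_end=20.0, seed=1):
    """Count how many simulated cells end nearest to each reference attractor."""
    rng = random.Random(seed)
    counts = dict.fromkeys(ATTRACTORS, 0)
    for _ in range(n_cells):
        x, y = simulate_cell(10, 10, n_hill, t_end, rng)
        nearest = min(
            ATTRACTORS,
            key=lambda s: (x - ATTRACTORS[s][0]) ** 2 + (y - ATTRACTORS[s][1]) ** 2,
        )
        counts[nearest] += 1
    return counts


for n_hill in (2, 6):
    print(n_hill, occupancy(n_hill))
```

Comparing the printed counts for different Hill coefficients gives a rough sense of how barrier steepness might change occupancy, although the small population and short run time used here make the numbers illustrative only.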

These results suggest that calculating the relative heights of the ridges and valleys on the computed epigenetic landscape of a multi-gene system can help predict the probability of trans-differentiation from one cell lineage to another, or de-differentiation of a particular cell type to its progenitor state. Current efforts to reprogram cell fate with potential application in regenerative medicine suffer from a low rate of successful reprogramming

**Supplementary Figures**. This file includes additional figures to supplement the text.


A dynamic landscape

The computed epigenetic landscape derived above should not be interpreted as a static surface


**The shape of the computed epigenetic landscape can be altered by modifying gene interaction parameters**. When basal expression _{y} of gene (see **Methods**) is increased from _{y} = 0 **(A)** to _{y} = 4 **(B)** (dimensionless units), attractor _{H} = 10 in both figures.
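To illustrate how a change in basal expression can reshape the attractors, the following Python sketch relaxes the same starting point under two basal rates for gene y. It assumes a Hill-type form for the tristable circuit and uses the Hill coefficient listed in the Methods rather than the value quoted in the caption. The function names and the parameter name `b_y` are hypothetical.

```python
def tristable_rates(x, y, b_y=0.0, f=10.0, k=4.0, deg=1.0, n=4):
    """(dx/dt, dy/dt) with self-activation and mutual repression (assumed form)."""
    kn = k ** n
    dxdt = f * x ** n / (kn + x ** n) + f * kn / (kn + y ** n) - deg * x
    dydt = b_y + f * y ** n / (kn + y ** n) + f * kn / (kn + x ** n) - deg * y
    return dxdt, dydt


def relax(x, y, b_y, dt=0.01, steps=10000):
    """Forward-Euler integration until the state has effectively settled."""
    for _ in range(steps):
        dxdt, dydt = tristable_rates(x, y, b_y=b_y)
        x, y = x + dxdt * dt, y + dydt * dt
    return x, y


start = (20.0, 0.0)  # near the high-x, low-y attractor when b_y = 0
unperturbed = relax(*start, b_y=0.0)
perturbed = relax(*start, b_y=4.0)
print("b_y = 0:", unperturbed)
print("b_y = 4:", perturbed)
```

Under these assumptions the starting state remains a steady state for a basal rate of 0 but is no longer one at 4, so the trajectory leaves it and settles at a state with much higher y.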

Interestingly, this flexibility of the quasi-potential surface under gene manipulation gives a quantitative interpretation of the revised image of the epigenetic landscape proposed by Waddington (Figure S3, Additional File

Conclusions

In this work, we have defined a deterministic quasi-potential that is minimized along a temporal trajectory followed by a gene network, and used it to quantitatively derive the corresponding epigenetic landscape. Because a gene network is not a mechanical system, this quasi-potential should not be confused with a potential energy function. Rather, it is a Liapunov function of the dynamical system represented by the gene network, along which trajectories flow monotonically "downhill" towards the steady states of the network _{q} in Eq. 4 to calculate the "energy landscape" for concentrations of one component in a gene network

This novel and simple process for deriving the surface of the landscape from a path-integral quasi-potential is not restricted to two-gene systems. While the landscape cannot be visually rendered for circuits with more than two genes, the rates of transition across the potential barriers between multiple steady states in the system can still be computed to predict optimum routes of cell fate reprogramming.

However, many binary branching points in development, particularly in blood cell lineage specification, are governed by mutual antagonism of only two transcription factors associated with alternative lineage choices

Methods

Bistable network model

To illustrate the derivation of the epigenetic landscape, we used a simplified mathematical model of a bistable network of two genes,

where variables _{X} and _{Y} denote the basal (constitutive) expression rates of genes _{YX} and _{XY} represent the rate constants, and _{DYX} and _{DXY} the effective affinity constants, for the suppressive effects of gene _{H} (the interaction is _{H} > 1). Parameters _{X} and _{Y} represent the first-order degradation rate constants for the two gene products _{YX} = _{XY} = 2; _{DYX} = 0.7; _{DXY} = 0.5; _{X} = _{Y} = 0.2; _{X} = _{Y} = 1; _{H} = 4. These values were tuned to ensure bistable switching behavior in the model.
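A minimal Python sketch of how such a two-gene toggle switch could be simulated is given here. It assumes a standard Hill-type repression term for each gene. The functional form, the names `toggle_rates` and `relax`, and the mapping of the listed values onto basal (0.2), rate (2), affinity (0.7 and 0.5), degradation (1), and Hill (4) parameters are assumptions made for illustration. The authors' own implementation was in MATLAB.

```python
def toggle_rates(x, y, b=0.2, f=2.0, k_yx=0.7, k_xy=0.5, deg=1.0, n=4):
    """Return (dx/dt, dy/dt) for a mutual-repression circuit (assumed form)."""
    # y represses x with affinity k_yx; x represses y with affinity k_xy.
    dxdt = b + f * k_yx ** n / (k_yx ** n + y ** n) - deg * x
    dydt = b + f * k_xy ** n / (k_xy ** n + x ** n) - deg * y
    return dxdt, dydt


def relax(x, y, dt=0.01, steps=5000):
    """Forward-Euler integration towards a steady state."""
    for _ in range(steps):
        dxdt, dydt = toggle_rates(x, y)
        x += dxdt * dt
        y += dydt * dt
    return x, y


high_x_state = relax(2.0, 0.1)  # start biased towards gene x
high_y_state = relax(0.1, 2.0)  # start biased towards gene y
print("high-x state:", high_x_state)
print("high-y state:", high_y_state)
```

Under these assumptions the two initial conditions settle into distinct steady states, one with high x and low y and the other with the reverse, which is the bistable behavior the parameter values were tuned for.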

Tristable network model

The tristable network model consisted of two genes,

where the new parameters _{XX} and _{YY} represent the rate constants, and _{DXX} and _{DYY} the effective affinity constants, for the positive autoregulation of genes _{XX} = _{YY} = _{YX} = _{XY} = 10; _{DXX} = _{DYY} = _{DYX} = _{DXY} = 4; _{X} = _{Y} = 0; _{X} = _{Y} = 1; _{H} = 4. This system has been modeled previously
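The tristable circuit can be sketched in the same spirit. The Python example here assumes that each gene has a Hill-type self-activation term in addition to Hill-type repression by the other gene, with all rate constants equal to 10, all affinity constants equal to 4, zero basal expression, unit degradation, and a Hill coefficient of 4. The functional form and the names `tristable_rates` and `settle` are illustrative assumptions.

```python
def tristable_rates(x, y, b=0.0, f=10.0, k=4.0, deg=1.0, n=4):
    """(dx/dt, dy/dt) for self-activation plus mutual repression (assumed form)."""
    kn = k ** n
    dxdt = b + f * x ** n / (kn + x ** n) + f * kn / (kn + y ** n) - deg * x
    dydt = b + f * y ** n / (kn + y ** n) + f * kn / (kn + x ** n) - deg * y
    return dxdt, dydt


def settle(x, y, dt=0.01, steps=10000):
    """Forward-Euler integration towards a steady state."""
    for _ in range(steps):
        dxdt, dydt = tristable_rates(x, y)
        x, y = x + dxdt * dt, y + dydt * dt
    return x, y


# Three starting points chosen to fall in three different basins of attraction.
states = {
    "high_x": settle(18.0, 1.0),
    "central": settle(9.0, 11.0),
    "high_y": settle(1.0, 18.0),
}
for name, (x, y) in states.items():
    print(name, round(x, 3), round(y, 3))
```

With this assumed form, the symmetric starting point relaxes to a central state where both genes are expressed at equal levels, while the two asymmetric starting points relax to states dominated by one gene.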

Integration Algorithm

To evaluate the change in the quasi-potential along each trajectory in

Thereafter, at each time step:

• The rates

• Expression levels

where for increments in time Δ

• The quasi-potential _{q} was updated as:

where:

The above steps were repeated until the quasi-potential _{q} converged to a minimum (decided by a pre-set tolerance). Multiple trajectories thus obtained were aligned into basins of attraction according to the process described in the main text. The quasi-potential surface was then derived by linear interpolation among the aligned trajectories.
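The per-trajectory loop described above can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB code. It assumes forward-Euler time stepping, an initial quasi-potential of zero, an increment equal to minus the sum of each rate multiplied by its displacement, and a Hill-type toggle-switch model for the rates. The names `toggle_rates` and `quasi_potential_path`, the step size, and the tolerance are all hypothetical.

```python
def toggle_rates(x, y, b=0.2, f=2.0, k_yx=0.7, k_xy=0.5, deg=1.0, n=4):
    """(dx/dt, dy/dt) for a mutual-repression circuit (assumed form)."""
    dxdt = b + f * k_yx ** n / (k_yx ** n + y ** n) - deg * x
    dydt = b + f * k_xy ** n / (k_xy ** n + x ** n) - deg * y
    return dxdt, dydt


def quasi_potential_path(rates, x, y, dt=0.01, tol=1e-12, max_steps=100000):
    """Follow one trajectory, accumulating the quasi-potential at each step."""
    v = 0.0  # arbitrary starting elevation for this trajectory
    path = [(x, y, v)]
    for _ in range(max_steps):
        dxdt, dydt = rates(x, y)          # evaluate the rates
        dx, dy = dxdt * dt, dydt * dt     # Euler increments in expression
        dv = -(dxdt * dx + dydt * dy)     # never positive for dt > 0
        x, y, v = x + dx, y + dy, v + dv  # update state and quasi-potential
        path.append((x, y, v))
        if abs(dv) < tol:                 # converged to a minimum
            break
    return path


path = quasi_potential_path(toggle_rates, 1.5, 0.5)
x_end, y_end, v_end = path[-1]
print(len(path), round(x_end, 3), round(y_end, 3), round(v_end, 4))
```

Because each trajectory here starts from an arbitrary elevation of zero, trajectories from different starting points would still need to be aligned within their basins of attraction before interpolating a surface, as the text describes.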

Software platforms used

The deterministic models were implemented and simulated on the MATLAB^{®} (R2009a, The MathWorks, Inc., Natick, MA) platform, while the BioNetS program ^{®}.

Visualization of stochastic simulation results

The stochastically simulated "cells" (i.e. individual realizations of the stochastic network model) were overlaid on the quasi-potential surface at the ^{®} function that draws pseudorandom values from the standard uniform distribution on the open interval (0,1)] to each simulated ^{®} format is appended in Additional File

**Supplementary Model Code**. This file lists the source code in MATLAB^{®} format for the computational algorithm used to derive the epigenetic landscape.


Authors' contributions

SB designed the study, constructed the computational model and performed computer simulations, and wrote the paper. QZ participated in study design. QZ and MEA discussed the results and commented on the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We thank J. D. Schroeter, C. Woods, R. B. Conolly, H. J. Clewell III, J.E. Trosko, M. Thattai, J. M. Haugh and B. Howell for critical discussions and reading of the manuscript. This work was supported by the Superfund Research Program of the U.S. National Institute of Environmental Health Sciences, and by the American Chemistry Council.