Modeling the evolution of a classic genetic switch

Josephides, Christos; Moses, Alan M

doi:10.1186/1752-0509-5-24

Research article
Open access
Published: 05 February 2011

Modeling the evolution of a classic genetic switch

Christos Josephides¹ &
Alan M Moses¹

BMC Systems Biology volume 5, Article number: 24 (2011) Cite this article

7555 Accesses
5 Citations
Metrics details

Abstract

Background

The regulatory network underlying the yeast galactose-use pathway has emerged as a model system for the study of regulatory network evolution. Evidence has recently been provided for adaptive evolution in this network following a whole genome duplication event. An ancestral gene encoding a bi-functional galactokinase and co-inducer protein molecule has become subfunctionalized as paralogous genes (GAL1 and GAL3) in Saccharomyces cerevisiae, with most fitness gains being attributable to changes in cis- regulatory elements. However, the quantitative functional implications of the evolutionary changes in this regulatory network remain unexplored.

Results

We develop a modeling framework to examine the evolution of the GAL regulatory network. This enables us to translate molecular changes in the regulatory network to changes in quantitative network function. We computationally reconstruct an inferred ancestral version of the network and trace the evolutionary paths in the lineage leading to S. cerevisiae. We explore the evolutionary landscape of possible regulatory networks and find that the operation of intermediate networks leading to S. cerevisiae differs substantially depending on the order in which evolutionary changes accumulate; in particular, we systematically explore evolutionary paths and find that some network features cannot be optimized simultaneously.

Conclusions

We find that a computational modeling approach can be used to analyze the evolution of a well-studied regulatory network. Our results are consistent with several experimental studies of the evolutionary of the GAL regulatory network, including increased fitness in Saccharomyces due to duplication and adaptive regulatory divergence. The conceptual and computational tools that we have developed may be applicable in further studies of regulatory network evolution.

Background

Regulatory networks are known to underlie many biological processes, and therefore their characterization and analysis forms a central focus of systems biology [1–4]. Despite their importance, relatively little is known about how regulatory networks are formed during evolution and shaped by natural selection.

One of the best studied regulatory networks in molecular biology is the "GAL network", which is responsible for the inducible metabolism of galactose in budding yeast. In addition to being extremely well-characterized in S. cerevisiae[5–7] it has also been the subject of a number of quantitative modeling efforts [8–11] and evolutionary studies, which have revealed many interesting patterns of regulatory network evolution [12–15]. Perhaps most general of these evolutionary paradigms is the duplication and divergence of function of an ancestral bi-functional gene. Before the whole genome duplication (WGD), which occurred along the lineage leading to S. cerevisiae[16, 17], the GAL network is thought to have employed the Gal1/3 protein to perform both an enzymatic and regulatory function. This bi-functional protein has been retained in species that diverged before the WGD [18]. Following the WGD, the two gene copies of GAL1/3 were converted to a highly inducible enzyme (GAL1) [19] and a weakly inducible regulatory protein (GAL3) [20] in post-WGD species. Recent work has demonstrated that these changes confer a growth advantage in galactose [21], and suggested that gene duplication and subsequent neofunctionalization represented an example of how a regulatory network can overcome an adaptive conflict, whereby the network cannot improve in one aspect without impairing function in another.

Here we set out to explore the consequences of evolutionary changes in the GAL network using a model of the regulatory network (Figure 1) to relate specific sequence changes to changes in quantitative function. Starting with experimentally characterized regulatory differences between the pre- and post-WGD GAL genes (see Methods), we inferred an ancestral organization of the regulatory network (see Results). Simulations of the ancestral and S. cerevisiae networks show that gene duplication and specialization lead to elevated gene expression in the presence of galactose, and decreased gene expression in the absence of galactose, thus leading to an improved switch-like system in S. cerevisiae. We then use the model to explore the significance of the order in which a set of maximally parsimonious evolutionary events separating a post-WGD ancestor and S. cerevisiae occur, and find important consequences for the function and evolution of the switch. We introduce the idea of evolutionary paths in the space of possible regulatory networks and develop quantitative measures to compare paths. We find that there are evolutionary paths in the lineage to S. cerevisiae that optimize particular features of their constituent network intermediates, some of which have been shown to be directly related to fitness. Perhaps more importantly, we find that it does not seem possible to optimize all network features in any single path.

Results

A hybrid stochastic-deterministic modeling framework for examining the evolution of the GAL regulatory network

In order to explore the functional consequences of evolutionary changes in the GAL network, we sought a quantitative approach to relate these molecular changes to changes in network operation. Because the evolutionary events of interest include variations in gene copy-number, changes in protein function, as well as changes in the number and spacing of cis-regulatory sequences, we required a modeling framework in which we could incorporate all these features and relate them to network behavior.

To model transcription/translation we modified and implemented a general physicochemical model of transcription regulation [22, 23] in a stochastic context. This allowed us to vary the regulatory features of promoters driving the GAL genes and hence to predict the effects of evolutionary changes in the cis- regulatory sequence of a particular promoter. To model changes in gene copy number, we simply changed the corresponding probability of transcription, for example, to model a gene duplication event, we multiplied the probability of that gene's transcription by two. Because of the difference in time scales and species numbers between transcription/translation reactions and protein-protein/galactose interactions [24], we modeled the latter using a system of ordinary differential equations. In order to model the effect of evolutionary changes in protein function, we assigned different sets of reaction possibilities to new molecular species created during the evolution of the network.

Hybrid approaches [25–28] such as this permit the inclusion of gene expression noise [29, 30], multiple time scales, and variable system mass in biological models while making simulations computationally feasible. Briefly, each step of the hybrid simulation algorithm consists of a stochastic part for the slow transcription/translation/degradation processes, and a deterministic part for the fast molecular interactions which is solved to equilibrium (Figure 2a, see Methods section for a detailed description of the simulation algorithm).

Model parameters were chosen to reproduce the quantitative operation of the GAL network in S. cerevisiae ([5, 6, 9, 21], see Methods for details of parameter choice and parameter sensitivity). Typical traces from simulations are shown in Figure 2b. Protein distributions were obtained at equilibrium from 100 independent simulations of each variant network at constant extracellular galactose concentrations in the range [10^-8, 10^-1] M in the absence of glucose (Figure 2c). The induction-response curve for S. cerevisiae in increasing concentrations of galactose displays the characteristic switch-like response (Figure 2d) and fold-inductions in gene expression are consistent with previously published experimental data (Table 1 in Methods).

Table 1 Simulation and literature results for the GAL network in S. cerevisiae

Full size table

Reconstructing the operation of an inferred ancestral network

GAL1 and GAL3 are paralogous genes which have diverged after a WGD event in the lineage leading to S. cerevisiae from a common bi-functional ancestor, GAL1/3, which serves as both the galactokinase and co-inducer molecule. It has recently been shown that the fitness of cells growing in galactose is related to the phenotype and operation of the GAL network. The cis-regulatory evolution of GAL1 and GAL3 has led to a more efficient genetic switch by subfunctionalizing the processes and regulation of galactose phosphorylation and induction [21].

To study the effects of these changes on the network's quantitative function we inferred an ancestral form of the network by removing the GAL1 and GAL3 genes from the S. cerevisiae network and substituting a bi-functional GAL1/3 gene driven by the ancestral promoter for KlacGAL1[21]. We have assumed that the other ancestral regulatory genes have remained unchanged in function and copy-number compared to S. cerevisiae.

We find that, relative to the inferred ancestor, the number of Gal1p (galactokinase) molecules in S. cerevisiae is increased in high galactose and reduced in very low galactose, which implies a more switch-like response in (105 fold in S. cerevisiae vs. 6 fold in the ancestor). At the same time, the number of Gal3p (inducer) molecules is decreased in all conditions relative to the ancestral protein (e.g., at very low galactose, there are ~5000 molecules in S. cerevisiae vs. ~27000 in the ancestor, Figure 2d). Thus, a single bi-functional gene seems to perform poorly in controlling the enzymatic and regulatory aspects of the network, consistent with experimental results [21].

The effect of an evolutionary change on the quantitative function of the GAL network depends on the network's evolutionary history

Having established the quantitative differences in network function between the ancestral and extant GAL network in S. cerevisiae, we next sought to explore the significance of the order of evolutionary events after the WGD on network operation.

Five events were considered: the regulatory and functional divergence of the two GAL1/3 copies, and the loss of the GAL2, GAL4, and GAL80 gene duplicates, according to the principle of maximum parsimony [31]. Since we know that there was an ancestral genome duplication event, we expect each gene to have been at two copies in some ancestor. We then make the maximally parsimonious assumption that each other gene was lost once. This is the minimum number of evolutionary changes that could have occurred, though not necessarily the scenario that was actually followed. Similarly, we know that there are three active promoter binding sites in the extant GAL1 promoter, and three in the ancestral GAL1/3 promoter, so we assume that were no additional binding site gains and losses during the specialization of GAL1/3 to GAL1. Likewise, we know that the GAL3 promoter has a single binding site, so we assume that two sites in GAL1/3 were lost without any additional intervening binding site gains or losses. Finally, since there is little data from other species to estimate parameter changes in the GAL network, we have assumed that these either remained unchanged or have incurred small changes (see Methods for parameter perturbation experiments).

We simulated all 33 of the possible network configurations that can arise as combinations of the regulatory changes required for the ancestral network configuration to evolve to the extant configuration in S. cerevisiae (1 pre-WGD ancestor + 1 post-WGD ancestor + 5 configurations with one event + 10 configurations with two events + 10 configurations with three events + 5 configurations with four events + 1 S. cerevisiae. See Table 2). These networks link S. cerevisiae and its ancestor via 120 unique evolutionary paths. Each path consists of 7 networks which are connected via the WGD event and a unique sequence of five evolutionary changes (Figure 3a).

Table 2 Configurations for the GAL network variants modeled in this study

Full size table

To evaluate the quantitative function of a network we considered the number of protein molecules at equilibrium in very low galactose (10^-8M, the "uninduced state") and conditions of very high galactose (0.1 M, the "induced state"). We defined three network features of interest for further examination:

Feature 1 "repression strength", is defined as the number of galactokinase molecules (Gal1p, Gal1/3p) in the uninduced state (10^-8 M galactose), such that a smaller number indicates better repression;

Feature 2 "induction strength", is defined as the number of galactokinase molecules in the induced state (0.1 M galactose), such that a larger number indicates stronger induction;

Feature 3 "switch effectiveness" is defined as the number of co-inducers (Gal3p, Gal1/3p) in the uninduced state (10^-8 M galactose), such that a smaller number indicates a more effective switch.

Changes in gene copy-number, protein function, and the number and arrangement of cis- regulatory sequences often resulted in significant changes in the network's quantitative function, as indicated by the differences in the steady-state galactokinase and co-inducer concentrations between the 33 network configurations (Figure 3b). The magnitude and direction of these changes, however, depended both on the particular evolutionary event being effected as well as on the configuration of the network being changed - that is, on the history of evolutionary changes that a network has accumulated. For example, we found that loss of the GAL80 repressor gene duplicate has a significantly smaller impact on repression strength if it occurs prior to the specialization of both copies of GAL1/3 (data not shown). This is because Gal80p-mediated repression of GAL1 is much stronger than that of GAL1/3 due to differences in the promoter regions of the two genes.

Quantitative assessment of the order of evolutionary changes reveals that there are evolutionary paths that optimize specific network features

All of the evolutionary paths in the space of possible regulatory networks terminate at S. cerevisiae and, as a consequence, eventually accrue identical changes in function relative to the inferred ancestral network. Nevertheless, because the effect of an evolutionary change depends on the configuration of the network at that point in evolution, intermediate GAL networks leading to S. cerevisiae occupy markedly different regions in functional space depending on the sequence in which evolutionary changes accumulate.

For each of the 120 evolutionary paths we applied a scoring scheme (Figure 4a) which penalizes an evolutionary event at that intermediate if it does not result in the best possible change in the network feature being scored (see Methods for a detailed explanation of our scoring scheme). We find large variations in these scores (Figure 4b) confirming that different sequences of evolutionary changes lead to different quantitative network function in the evolutionary intermediates. Of the three features considered, induction strength shows the smallest variation, perhaps indicating that there is the least potential for evolution in this axis (see Discussion).

The path that optimizes repression strength (0 penalty) consists of the sequence: specialization of GAL3, specialization of GAL1, loss of GAL4 duplicate, loss of GAL2 duplicate, loss of GAL80 duplicate (red path in Figure 3a, b). The sequence: specialization of GAL1, specialization of GAL3, loss of GAL4 duplicate, loss of GAL2 duplicate, loss of GAL80 duplicate, optimizes switch effectiveness. The sequence: specialization of GAL1, loss of GAL2, loss of GAL80, specialization of GAL3, loss of GAL4 (green path in Figure 3a, b) optimizes induction strength.

We found that evolutionary paths where the two specialization events of GAL1/3 to GAL1 and GAL3 occur in close sequence perform well in maximizing repression strength and switch effectiveness for intermediate GAL networks. Path scores in these two features are negatively correlated (R² = 0.85) with the number of evolutionary events separating the specialization of GAL1 and GAL3 (Figure 5a). In addition, paths where the second copy of GAL80 is lost after all other evolutionary changes show significantly higher repression strength compared to other paths (-0.34+/-0.19 vs. -1.96+/-1.05, P = 0.001, two-sample t-test, Figure 5b). It was also found that the group of paths where specialization of GAL1/3 takes place before any gene duplicates are lost has a higher average switch effectiveness score than paths where duplicates are lost prior to specialization (-0.16+/-0.12 vs. -1.24+/-0.57, P < 0.001, two-sample t-test, data not shown). Finally, we find a correlation (R² = 0.72) between how early GAL1 specializes and the strength of induction (Figure 5c), as well as a weaker correlation with how late GAL3 specializes and the induction strength scores (R² = 0.34, data not shown).

We next explored the relationship between the scores of the different features. Evolutionary paths tend to maximize switch effectiveness scores and repression strength scores together (R² = 0.72, data not shown). Interestingly, however, paths cannot optimize induction strength and switch effectiveness/repression strength simultaneously (Figure 5d). As a result, evolutionary paths may optimize switch effectiveness/repression or strength of induction, but not both features at the same time.

In the preceding sections, we have made the assumption that the parameters of the network remain the same as those observed or inferred in S. cerevisiae. However, in order to test whether the results presented above are sensitive to parameters, we have performed a series of perturbation experiments in which parameters were randomly modified during evolution and found that the observations presented were all recapitulated in the perturbed networks (see Methods).

Discussion

Using a quantitative model of the regulatory network we were able integrate the wealth of experimental knowledge about the S. cerevisiae GAL network with promoter swap experiments and to infer and simulate the operation of an ancestral form of the GAL network. Two of our observations are consistent with the report of increased fitness after promoter specialization [21]. First, our observation of a higher expression of GAL1 relative to the ancestor could lead to an elevated metabolic capacity (galactose flux through the galactokinases) for S. cerevisiae in high galactose, and therefore perhaps to increased growth rate. Second, we observed a decrease in the number of protein molecules in the absence of galactose, which could lead to increased growth in the absence of galactose through a decrease in the energetic cost of protein synthesis (smaller number of protein molecules) (see [32] for an account of cost-benefit analysis of gene expression in a different system).

Despite the apparent consistency between our results and the report of adaptive specialization along the S. cerevisiae lineage, we found it difficult to establish a rigorous relationship between the quantitative function of the network and the network's fitness landscape. For example, in order to understand the potential benefit of increased levels of GAL1 expression, it is necessary to consider the impact of the galactokinase on the entire metabolic network [33]. In particular, consideration of the toxicity of the metabolic intermediates is important and greatly complicates this analysis [34]. Perhaps even more difficult is that we do not know the relative importance to the population of the induced or uninduced states of the GAL network. Clearly, the amount of time that S. cerevisiae may spend exposed to galactose depends on the environment and history of the population. It is unclear if this information will ever be available to us.

Analysis of evolutionary paths

Our scoring of evolutionary paths in the functional space indicates that most potential phenotypic divergence takes place in the uninduced state (repression strength and switch effectiveness, Figure 5b). This is consistent with experimental findings that report a greater effect of cis-regulatory variation on gene repression than on induction [35]. We also found that no evolutionary path takes optimal steps (steps which incur no penalty) in every functional dimension at every network intermediate. In particular, we found that paths that maximized repression strength and switch effectiveness were sub-optimal with respect to induction strength.

Interestingly, we note that GAL networks in other extant species have specialized their GAL1/3 homologues, but have retained various duplicates. For example, S. bayanus and S. castellii have specialized forms of GAL1 and GAL3, but S. bayanus has kept two GAL80 copies and reacquired a tandem GAL2 duplicate [15], while S. castellii has kept two GAL4 and GAL80 copies [36]. According to our evolutionary path assessment, the specialization of GAL1 and GAL3 before the loss of regulatory gene duplicates, as well as the close proximity of these specialization events, improves switch effectiveness and repression strength. Retention of the GAL80 duplicate in both species is also consistent with better repression strength scores (Figure 4a, b).

We speculate a possible explanation for these observations. The GAL network allows for a larger phenotypic divergence in the uninduced state (consistent with our observations of larger variance in path scores pertaining to switch effectiveness and strength of repression). The network's capacity to improve and, therefore, the growth advantage, is greater if networks follow the path of uninduced state optimization.

Assumptions of the approach

Our method of exploring the evolution of the GAL network is based on several assumptions. First, in constructing the evolutionary paths leading to S. cerevisiae, we assume that the evolutionary distance between two GAL network intermediates that are connected by a regulatory change is always constant - that is, we treat all evolutionary changes as equally likely to occur at each step of a path. Second, we make the assumption that regulatory elements not present in S. cerevisiae were also not present in the ancestor. While network features not present in S. cerevisiae, such as a unique UAS configuration on a gene promoter, could have existed in the ancestor or intermediate networks, we have constrained the network's regulatory space to include only those evolutionary events required under the assumption of maximum parsimony - which states that, when confronted with multiple evolutionary scenarios that explain some data, the scenario that requires the fewest evolutionary events is the most probable to have been followed [31]. Third, while we attempt to model parameter perturbations during network evolution, we cannot exclude the possibility that mutations can occur that have very large effects on parameter values. Potentially, experimental characterization of the GAL network in other species [37] could give an indication of such events, which could then be modeled as distinct evolutionary changes in the history of the GAL network.

Conclusions

A hybrid stochastic-deterministic modeling framework has been used to explore the effect of regulatory divergence in the GAL network following the whole genome duplication. We have shown that some evolutionary paths optimize distinct features of network operation and that intermediate GAL networks in the lineage to S. cerevisiae could not have optimized all network features at the same time.

Methods

Hybrid stochastic-deterministic simulation algorithm

Our model of the GAL regulatory network contains 18 molecular species with extracellular galactose acting as the input to the system (Table 3). Simulations of the GAL network were performed using a hybrid stochastic-deterministic algorithm where each iteration of the algorithm consists of a stochastic part and a deterministic part (the latter of which may be skipped as an approximation method).

Table 3 Molecular species used in the GAL network models

Full size table

1. Deterministic part

The first part of the algorithm begins with an algebraic step which models galactose transport to the cell interior:

[g a l_{i n}] = [g a l_{o u t}] (a + b \cdot g a l 2 p),

where we used a = 0.001 and b = 0.01 as transport coefficients for the non-specific hexose transporters and galactose permeases (Gal2p) respectively. In the above equation, and in equations that follow, molecular names in square brackets refer to the concentration of that species, whereas names without brackets refer to that species' number of molecules.

Protein-galactose and protein-protein interactions involved in signal transduction are modeled as a system of ODEs:

\frac{\partial}{\partial t} [G a l] = M [G a l],

M is the mass matrix specified by the following equations:

[g a l_{i n}] + [G a l 3 p_{i}] ⇌_{k_{3}^{-}}^{k_{3}^{+}} [G a l 3 p_{a}],

k_{3}^{+} = 10, k_{3}^{-} = 1

\begin{matrix} [g a l_{i n}] + [G a l 13 p_{i}] ⇌_{k_{13}^{-}}^{k_{13}^{+}} [G a l 13 p_{a}], \\ k_{13}^{+} = 10, k_{13}^{-} = 1 \end{matrix}

\begin{matrix} [G a l 3 p_{a}] + [G a l 80 p] ⇌_{k_{380}^{-}}^{k_{380}^{+}} [G a l 380 p_{a}], \\ k_{380}^{+} = 10, k_{380}^{-} = 1 \end{matrix}

\begin{matrix} [G a l 13 p_{a}] + [G a l 80 p] ⇌_{k_{1380}^{-}}^{k_{1380}^{+}} [G a l 1380 p_{a}], \\ k_{1380}^{+} = 10, k_{1380}^{-} = 1 \end{matrix}

\begin{matrix} [G a l 80 p] + [G a l 4 p] ⇌_{k_{480}^{-}}^{k_{480}^{+}} [G a l 480 p], \\ k_{480}^{+} = 1, k_{480}^{-} = 1 \end{matrix}

Where the k⁺ and k^- are the forward and reverse kinetic rates respectively for each reaction. Steady-state concentrations are used to update the components of the system, but simulation time is not advanced. This rule permits us to treat protein-protein/galactose interactions as much faster reactions than protein synthesis/degradation (approximately instantaneous), while still allowing stochastic fluctuations to influence the deterministic solution trajectory in this multi-stable system (data not shown).

2. Stochastic part

In the second part of the algorithm, the processes of protein synthesis and degradation are modeled stochastically. The stoichiometric matrix was constructed from the following reactions:

\begin{matrix} G A L_{i} \overset{s_{i}}{\to} G A L_{i} + b \cdot G a l_{i'} \\ i \in {1, 2, 3, 4, 80, 13} \\ (protein synthesis, burst factor, b = 2 .5), \end{matrix}

\begin{matrix} G a l_{i} \overset{d_{i}}{\to} G a l_{i} - 1, \\ i \in {1, 2, 3_{i}, 3_{a}, 380, 4, 480, 80, 13_{i}, 13_{a}, 1380} \\ (protein degradation), \end{matrix}

for a total of 17 reaction channels involving 18 molecular species. Stochastic simulations were performed using the Gillespie algorithm [38]. Transcription and translation were modeled as a single step, where each mRNA molecule generates 2.5 protein molecules. The propensity function, s, for transcribing an mRNA molecule is

\begin{matrix} s_{i} = G A L_{i} \cdot f_{p_{i}} ([G a l 4, G a l 480]), \\ i \in {1, 2, 3, 4, 80, 13}, \end{matrix}

where $f_{P_{i}}$ is the promoter model assigned to gene GAL_i (see Promoter models section). Our model accounts for changes in gene copy-number because the probability of transcript production is proportional to the number of gene copies for that molecular species in the network. This is a result of the stochastic simulation algorithm, where multiplying the promoter function by gene number is equivalent to introducing additional identical reaction channels. The propensity function for protein degradation, d, is

\begin{matrix} d_{i} = G a l_{i} \cdot γ, \\ i \in {1, 2, 3, 3_{i}, 3_{a}, 380, 4, 480, 13_{i}, 13_{a}, 1380}, \\ γ = 0.002, \end{matrix}

where γ is the degradation rate.

The Gillespie algorithm samples the joint probability distribution of reaction events and times at each iteration of the stochastic part, selects a reaction, and generates a time interval containing no reactions, τ, which is used to advance the simulation time.

Promoter models

The propensity function of protein synthesis in the stochastic part of the simulation algorithm is driven by a physicochemical promoter model. An upstream activation sequence (UAS) on a promoter may be empty, occupied by a Gal4p molecule, or occupied by a Gal4-80p molecule. Sites occupied by Gal4p interact with and recruit RNA polymerase and the transcriptional machinery with an affinity that is intrinsic to the particular promoter (Table 4). Empty sites and sites occupied by Gal4-80p may also allow some basal transcription to occur. The promoter function, f_PI, for transcription is the sum of the contribution of all sites to transcription initiation in each occupancy configuration, weighted by the configuration's probability:

Table 4 GAL promoter models

Full size table

f_{P_{i}} ([G a l 4], [G a l 480]) = \frac{{\sum_{c} K_{c} [G a l 4]}^{n 4_{c}} {[G a l 480]}^{n 480_{c}} e^{\frac{1}{β} Δ G_{c}}}{\sum {[G a l 4]}^{n 4_{c}} {[G a l 480]}^{n 480_{c}} e^{\frac{1}{β} Δ G_{c}}}

Where c index the occupancy configurations for the promoter driving gene GAL_i . K_c is the sum of contributions to transcription initiation by all sites in configuration c. n 4 _c and n 480_c are the numbers of Gal4p- and Gal480p- bound sites in configuration c respectively. β (1.68 kcal/mol) is the product of the gas constant and the system's temperature (300 K). ΔG_c is the sum of the Gibbs' free energy changes from all DNA-protein binding events in configuration c, as well as any cooperative interactions between adjacently bound activator and/or repressor proteins which stabilize the DNA-protein complex. Taking the configuration where all sites are empty as the reference state of each promoter, we used -13.86 kcal/mol as the energy change for the binding of either a Gal4p or Gal4-80p to a UAS [39]. We chose -2.00 kcal/mol and -3.00 kcal/mol for energy changes due to a single cooperative interaction between appropriately positioned Gal4p and Gal480p respectively because they led to proper quantitative promoter function in S. cerevisiae, and we could not identify these parameters in the literature. In the absence of any UAS on a promoter, as in the case for the GAL4 promoter, the propensity function evaluates to a constant, Kc, giving constitutive expression.

Algorithm execution

The model was implemented and simulated in MATLAB^® (The MathWorks). Simulations were run for 8,000 seconds, which was found to be much longer than the typical time for equilibrium (data not shown). To decrease computation time - particularly in systems with high protein concentrations - the deterministic step was executed only every n iterations, where n scaled linearly with the total number of molecules participating in the deterministic step. The rationale behind this approximation is that stochastic synthesis or degradation of a few molecules will not significantly change the equilibrium of interacting proteins when large concentrations of proteins are already reacting. Trial simulations run with and without this approximation procedure followed similar induction dynamics, attained equilibrium at the same times, and had similar equilibrium behavior (data not shown).

Conversions between number of molecules and concentrations are performed at each step as required using 1.66 × 10^-15 and 23 × 10^-15L as nuclear and cellular volumes respectively [40, 41].

The mean and variance for the number of molecules for each species was obtained from the equilibrium distributions of 100 independent simulations at 8 extracellular galactose concentrations (10^-8, 10^-7, 10^-6, 10^-5, 10^-4, 10^-3, 10^-2, and 10^-1 M). This was done for each of the 33 GAL networks to construct their response curves to increasing galactose concentrations.

Scoring scheme for evolutionary paths

Each evolutionary event is scored according to its effect on some feature - for example, the number of molecules with galactokinase activity at a certain extracellular galactose concentration. We begin scoring evolutionary events after the genome duplication and consider the 120 unique combinations of the five events required for the post-genome duplication ancestor to evolve to S. cerevisiae. For each feature to be scored, we defined either the largest fold-increase or fold-decrease to be the optimal change (see Results section). Events are scored relative to the possible alternative events at each intermediate network along the path. An observed event is penalized if it does not result in the best possible change in the feature from amongst the possible events. To keep the scoring scheme consistent, if the feature is to be maximized, we used:

s = \sum_{i = 1}^{5} \log_{10} (\frac{x_{a = o, i}}{\max_{a} {x_{a_{j},i}}})

If the feature is to be minimized, the total path score is:

s = \sum_{i = 1}^{5} \log_{10} (\frac{\min_{a} {x_{a_{j}}}}{x_{a = o, i}})

where x_a,i, is the fold-change in the network feature resulting from the i-th evolutionary step after the genome duplication, for the a-th alternative network. We indicate the observed evolutionary choice in the path at step i, as a = o. See Figure 4a for an illustration of the scoring scheme.

The scoring scheme provides an objective means of ranking evolutionary paths in regulatory space that optimize some network feature. The path with the highest possible score (which is 0) consists of the order of events which results in the best possible sequence of changes in the scored feature. Conversely, the worst-scoring path follows none of the best alternatives (except for the last evolutionary event).

Parameter selection

The number of protein molecules for each molecular species were inferred or taken directly from various sources as indicated in Table 1. In addition to fold-induction data, the following gene expression relationships from promoter replacement studies in S. cerevisiae[21] were used:

\begin{matrix} P S c G A L 3 \to G A L 1 = \\ 2.12 \times P S c G A L 1 \to G A L 1, \\ uninduced conditions \end{matrix}

\begin{matrix} P S c G A L 3 \to G A L 1 = \\ 0.13 \times P S c G A L 1 \to G A L 1, \\ induced conditions \end{matrix}

\begin{matrix} P K l a c G A L 1 / 3 \to G A L 1 = \\ 9.95 \times P S c G A L 1 \to G A L 1, \\ uninduced conditions \end{matrix}

\begin{matrix} P K l a c G A L 1 / 3 \to G A L 1 = \\ 0.54 \times P S c G A L 1 \to G A L 1, \\ induced conditions \end{matrix}

PGALi→GAL 1 indicates a network where GAL 1 is driven by GALi's wild-type promoter, in a gal 1 - Δ background. PScGALi is the S. cerevisiae wild type promoter for GALi and PKlacGAL1/3 is the K. lactis wild type promoter for GAL 1/3, which is thought to be similar to the ancestral bifunctional gene promoter.

The relationship [Gal3p] ≈ 5[Gal80p] in both uninduced and induced conditions was also implemented [6]. These parameters were used to simulate all networks in the regulatory space that we have investigated - under the assumption that, while regulatory features might evolve, their mode of operation remains unchanged (see Discussion).

Parameter set perturbations

To investigate the robustness of our results to perturbations in the parameter set we executed 50 further simulations of each of the 120 possible evolutionary paths while enforcing a random, non-persistent, parameter change at each network intermediate, not including S. cerevisiae. Specifically, we allowed random perturbations in the following parameters and ranges:

1.
Kinetic parameters: Multiplicative factor in [0.1, 10] from 10 ^U(-1,1) .
2.
Degradation rates, burst factor: Multiplicative factor in [0.5, 2] from 2 ^U(-1,1) .
3.
Contributions to transcript production (promoter models): Multiplicative factor in [0.5, 2] from 2 ^U(-1,1) _.
4.
Binding energies (promoter models): Additive factor in [-1, 1] from U(-1,1).

where U is the uniform distribution.

These random parameter perturbations can be thought of as minor mutations in the network components which do not form part of the parsimonious set of evolutionary changes along the lineage to S. cerevisiae, but which may nevertheless have occurred and subsequently been lost.

Under parameter perturbations, evolutionary path score distributions had larger variance, often skewed toward worse scores, with occasional perturbations completely abolishing the switch- like behavior of the system (Additional File 1, Figures S1-S3). Most perturbations, however, did not have a large impact on perturbed path scores.

Specifically, we find that the number of evolutionary events separating the specialization of ScGAL1 and ScGAL3 again correlates negatively with repression strength scores (R² = 0.85, Additional File 1, Figure S4a). The number of events preceding the specialization of ScGAL1 correlates negatively with induction strength scores (R² = 0.64, Additional File 1, Figure S4b). A significant difference in repression strength scores was again found between evolutionary paths were GAL80 duplicate loss is the last event and all other paths (-0.40 +/- 0.23 versus -2.08 +/- 1.03, P < 0.01, Additional File 1, Figure S4c). Perturbed evolutionary paths tend to maximize switch effectiveness scores and repression strength scores together (R² = 0.73, data not shown). Finally, we report that, even with a perturbed parameter set, there exists no single path which optimizes all three aspects of quantitative network behavior (Additional File 1, Figure S4d).

References

Wyrick JJ, Young RA: Deciphering gene expression regulatory networks. Curr Opin Genet Dev 2002, 12: 130-136. 10.1016/S0959-437X(02)00277-0
Article CAS PubMed Google Scholar
Chua G, Robinson MD, Morris Q, Hughes TR: Transcriptional networks: reverse-engineering gene regulation on a global scale. Curr Opin Microbiol 2004, 7: 638-646. 10.1016/j.mib.2004.10.009
Article CAS PubMed Google Scholar
Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, Zeitlinger J, Jennings EG, Murray HL, Gordon DB, Ren B, Wyrick JJ, Tagne JB, Volkert TL, Fraenkel E, Gifford DK, Young RA: Transcriptional regulatory networks in Saccharomyces cerevisiae . Science 2002, 298: 799-804. 10.1126/science.1075090
Article CAS PubMed Google Scholar
Teichmann SA, Babu MM: Gene regulatory network growth by duplication. Nat Genet 2004, 36: 492-496. 10.1038/ng1340
Article CAS PubMed Google Scholar
Lohr D, Venkov P, Zlatanova J: Transcriptional regulation in the yeast GAL gene family: a complex genetic network. FASEB J 1995, 9: 777-787.
CAS PubMed Google Scholar
Rubio-Texeira M: A comparative analysis of the GAL genetic switch between not-so distant cousins: Saccharomyces cerevisiae versus Kluyveromyces lactis . FEMS Yeast Res 2005, 5: 1115-1128. 10.1016/j.femsyr.2005.05.003
Article CAS PubMed Google Scholar
Traven A, Jelicic B, Sopta M: Yeast Gal4: a transcriptional paradigm revisited. EMBO Rep 2006, 7: 496-499. 10.1038/sj.embor.7400679
Article PubMed Central CAS PubMed Google Scholar
Ideker T, Thorsson V, Ranish JA, Christmas R, Buhler J, Eng JK, Bumgarner R, Goodlett DR, Aebersold R, Hood L: Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 2001, 292: 929-934. 10.1126/science.292.5518.929
Article CAS PubMed Google Scholar
Acar M, Becskei A, van Oudenaarden A: Enhancement of cellular memory by reducing stochastic transitions. Nature 2005, 435: 228-232. 10.1038/nature03524
Article CAS PubMed Google Scholar
Ruhela A, Verma M, Edwards JS, Bhat PJ, Bhartiya S, Venkatesh KV: Autoregulation of regulatory proteins is key for dynamic operation of GAL switch in Saccharomyces cerevisiae . FEBS Lett 2004, 576: 119-126. 10.1016/j.febslet.2004.09.001
Article CAS PubMed Google Scholar
Ramsey SA, Smith JJ, Orrell D, Marelli M, Petersen TW, deAtauri P, Bolouri H, Aitchison JD: Dual feedback loops in the GAL regulon suppress cellular heterogeneity in yeast. Nat Genet 2006, 38: 1082-1087. 10.1038/ng1869
Article CAS PubMed Google Scholar
Hittinger CT, Goncalves P, Sampaio JP, Dover J, Johnston M, Rokas A: Remarkably ancient balanced polymorphisms in a multi-locus gene network. Nature 2010, 464: 54-58. 10.1038/nature08791
Article PubMed Central CAS PubMed Google Scholar
Martchenko M, Levitin A, Hogues H, Nantel A, Whiteway M: Transcriptional rewiring of fungal galactose-metabolism circuitry. Curr Biol 2007, 17: 1007-1013. 10.1016/j.cub.2007.05.017
Article CAS PubMed Google Scholar
Slot JC, Rokas A: Multiple GAL pathway gene clusters evolved independently and by different mechanisms in fungi. Proc Natl Acad Sci USA 2010, 107: 10136-10141. 10.1073/pnas.0914418107
Article PubMed Central CAS PubMed Google Scholar
Hittinger CT, Rokas A, Carroll SB: Parallel inactivation of multiple GAL pathway genes and ecological diversification in yeasts. Proc Natl Acad Sci USA 2004, 101: 14144-14149. 10.1073/pnas.0404319101
Article PubMed Central CAS PubMed Google Scholar
Wolfe KH, Shields DC: Molecular evidence for an ancient duplication of the entire yeast genome. Nature 1997, 387: 708-713. 10.1038/42711
Article CAS PubMed Google Scholar
Kellis M, Birren BW, Lander ES: Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae . Nature 2004, 428: 617-624. 10.1038/nature02424
Article CAS PubMed Google Scholar
Meyer J, Walker-Jonah A, Hollenberg CP: Galactokinase encoded by GAL1 is a bifunctional protein required for induction of the GAL genes in Kluyveromyces lactis and is able to suppress the gal3 phenotype in Saccharomyces cerevisiae . Mol Cell Biol 1991, 11: 5454-5461.
Article PubMed Central CAS PubMed Google Scholar
Johnston M, Davis RW: Sequences that regulate the divergent GAL1-GAL10 promoter in Saccharomyces cerevisiae . Mol Cell Biol 1984, 4: 1440-1448.
Article PubMed Central CAS PubMed Google Scholar
Bajwa W, Torchia TE, Hopper JE: Yeast regulatory gene GAL3: carbon regulation; UASGal elements in common with GAL1, GAL2, GAL7, GAL10, GAL80, and MEL1; encoded protein strikingly similar to yeast and Escherichia coli galactokinases. Mol Cell Biol 1988, 8: 3439-3447.
Article PubMed Central CAS PubMed Google Scholar
Hittinger CT, Carroll SB: Gene duplication and the adaptive evolution of a classic genetic switch. Nature 2007, 449: 677-681. 10.1038/nature06151
Article CAS PubMed Google Scholar
Shea MA, Ackers GK: The OR control system of bacteriophage lambda. A physical chemical model for gene regulation. J Mol Biol 1985, 181: 211-230. 10.1016/0022-2836(85)90086-5
Article CAS PubMed Google Scholar
Buchler NE, Gerland U, Hwa T: On schemes of combinatorial transcription logic. Proc Natl Acad Sci USA 2003, 100: 5136-5141. 10.1073/pnas.0930314100
Article PubMed Central CAS PubMed Google Scholar
Bolouri H: Computational Modelling of Gene Regulatory Networks - A Primer. New Haven: Imperial College Press; 2008.
Book Google Scholar
Alfonsi A, Cances E, Turinici G, Ventura BD, Huisinga W: Exact simulation of hybrid stochastic and deterministic models for biochemical systems. ESAIM Proc 2005, 14: 113.
Google Scholar
Kiehl TR, Mattheyses RM, Simmons MK: Hybrid simulation of cellular behavior. Bioinformatics 2004, 20: 316-322. 10.1093/bioinformatics/btg409
Article CAS PubMed Google Scholar
Rudiger S, Shuai JW, Huisinga W, Nagaiah C, Warnecke G, Parker I, Falcke M: Hybrid stochastic and deterministic simulations of calcium blips. Biophys J 2007, 93: 1847-1857. 10.1529/biophysj.106.099879
Article PubMed Central CAS PubMed Google Scholar
Salis H, Kaznessis Y: Accurate hybrid stochastic simulation of a system of coupled chemical or biochemical reactions. J Chem Phys 2005, 122: 54103. 10.1063/1.1835951
Article PubMed Google Scholar
McAdams HH, Arkin A: Stochastic mechanisms in gene expression. Proc Natl Acad Sci USA 1997, 94: 814-819. 10.1073/pnas.94.3.814
Article PubMed Central CAS PubMed Google Scholar
Arkin A, Ross J, McAdams HH: Stochastic kinetic analysis of developmental pathway bifurcation in phage lambda-infected Escherichia coli cells. Genetics 1998, 149: 1633-1648.
PubMed Central CAS PubMed Google Scholar
Fitch WM: Toward defining the course of evolution: minimum change for a specific tree topology. Systematic Zoology 1971,20(4):406-416. 10.2307/2412116
Article Google Scholar
Dekel E, Alon U: Optimality and evolutionary tuning of the expression level of a protein. Nature 2005, 436: 588-592. 10.1038/nature03842
Article CAS PubMed Google Scholar
Ostergaard S, Olsson L, Nielsen J: In vivo dynamics of galactose metabolism in Saccharomyces cerevisiae : metabolic fluxes and metabolite levels. Biotechnol Bioeng 2001, 73: 412-425. 10.1002/bit.1075
Article CAS PubMed Google Scholar
de Jongh WA, Bro C, Ostergaard S, Regenberg B, Olsson L, Nielsen J: The roles of galactitol, galactose-1-phosphate, and phosphoglucomutase in galactose-induced toxicity in Saccharomyces cerevisiae. Biotechnol Bioeng 2008, 101: 317-326. 10.1002/bit.21890
Article CAS PubMed Google Scholar
Bram RJ, Lue NF, Kornberg RD: A GAL family of upstream activating sequences in yeast: roles in both induction and repression of transcription. EMBO J 1986, 5: 603-608.
PubMed Central CAS PubMed Google Scholar
Rokas A, Hittinger CT: Transcriptional rewiring: the proof is in the eating. Curr Biol 2007, 17: R626-628. 10.1016/j.cub.2007.06.025
Article CAS PubMed Google Scholar
Pannala VR, Bhartiya S, Venkatesh KV: Experimental and steady-state analysis of the GAL regulatory system in Kluyveromyces lactis . FEBS J 2010, 277: 2987-3002. 10.1111/j.1742-4658.2010.07708.x
Article CAS PubMed Google Scholar
Gillespie DT: Exact stochastic simulation of coupled chemical reactions. J Phys Chem 1977, 81: 2340-2361. 10.1021/j100540a008
Article CAS Google Scholar
Melcher K, Xu HE: Gal80-Gal80 interaction on adjacent Gal4p binding sites is required for complete GAL gene repression. EMBO J 2001, 20: 841-851. 10.1093/emboj/20.4.841
Article PubMed Central CAS PubMed Google Scholar
Anders A, Lilie H, Franke K, Kapp L, Stelling J, Gilles ED, Breunig KD: The galactose switch in Kluyveromyces lactis depends on nuclear competition between Gal4 and Gal1 for Gal80 binding. J Biol Chem 2006, 281: 29337-29348. 10.1074/jbc.M604271200
Article CAS PubMed Google Scholar
Winey M, Yarar D, Giddings TH, Mastronarde DN: Nuclear pore complex number and distribution throughout the Saccharomyces cerevisiae cell cycle by three-dimensional reconstruction from electron micrographs of nuclear envelopes. Mol Biol Cell 1997, 8: 2119-2132.
Article PubMed Central CAS PubMed Google Scholar
Giniger E, Ptashne M: Cooperative DNA binding of the yeast transcriptional activator GAL4. Proc Natl Acad Sci USA 1988, 85: 382-386. 10.1073/pnas.85.2.382
Article PubMed Central CAS PubMed Google Scholar

Download references

Acknowledgements

This work was supported by a research award from the Center for the Analysis of Genome Evolution and Function (CAGEF) to CJ and CFI infrastructure and NSERC discovery grants to AMM. We thank Alex N Nguyen Ba, Andy Lai, Dr. John Parkinson, and Dr. Chris Hittinger for comments on the manuscript. We acknowledge Dr. Chris Hittinger for access to unpublished data.

Author information

Authors and Affiliations

Departments of Cell & Systems Biology and Ecology & Evolutionary Biology, University of Toronto, 25 Willcocks Street, Toronto, ON, M5 S 3B2, Canada
Christos Josephides & Alan M Moses

Authors

Christos Josephides
View author publications
You can also search for this author in PubMed Google Scholar
Alan M Moses
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Alan M Moses.

Additional information

Authors' contributions

AMM conceived of the project. AMM and CJ designed research and wrote the manuscript. CJ performed all research. Both authors have read and approved the final manuscript.

Electronic supplementary material

12918_2010_613_MOESM1_ESM.PDF

Additional file 1:Path score distribution comparisons and assessment of evolutionary paths under parameter perturbations. Box plot comparisons of the three evolutionary path score distributions between unperturbed and perturbed paths (Figures S1-3). Correlations between evolutionary path scores and classes of paths under parameter perturbations (Figure S4). (PDF 361 KB)

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Josephides, C., Moses, A.M. Modeling the evolution of a classic genetic switch. BMC Syst Biol 5, 24 (2011). https://doi.org/10.1186/1752-0509-5-24

Download citation

Received: 30 July 2010
Accepted: 05 February 2011
Published: 05 February 2011
DOI: https://doi.org/10.1186/1752-0509-5-24

Modeling the evolution of a classic genetic switch

Abstract

Background

Results

Conclusions

Background

Results

A hybrid stochastic-deterministic modeling framework for examining the evolution of the GAL regulatory network

Reconstructing the operation of an inferred ancestral network

The effect of an evolutionary change on the quantitative function of the GAL network depends on the network's evolutionary history

Quantitative assessment of the order of evolutionary changes reveals that there are evolutionary paths that optimize specific network features

Discussion

Analysis of evolutionary paths

Assumptions of the approach

Conclusions

Methods

Hybrid stochastic-deterministic simulation algorithm

1. Deterministic part

2. Stochastic part

Promoter models

Algorithm execution

Scoring scheme for evolutionary paths

Parameter selection

Parameter set perturbations

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Authors' contributions

Electronic supplementary material

12918_2010_613_MOESM1_ESM.PDF

Authors’ original submitted files for images

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BMC Systems Biology

Contact us