Abstract
We explore whether the process of multimerization can be used as a means to regulate noise in the abundance of functional protein complexes. Additionally, we analyze how this process affects the mean level of these functional units, response time of a gene, and temporal correlation between the numbers of expressed proteins and of the functional multimers. We show that, although multimerization increases noise by reducing the mean number of functional complexes it can reduce noise in comparison with a monomer, when abundance of the functional proteins are comparable. Alternatively, reduction in noise occurs if both monomeric and multimeric forms of the protein are functional. Moreover, we find that multimerization either increases the response time to external signals or decreases the correlation between number of functional complexes and protein production kinetics. Finally, we show that the results are in agreement with recent genomewide assessments of celltocell variability in protein numbers and of multimerization in essential and nonessential genes in Escherichia coli, and that the effects of multimerization are tangible at the level of genetic circuits.
Introduction
Proteins regulate various cellular processes. There are several mechanisms responsible for regulating their numbers in cells, which act at various stages of protein production [14], activation [5], and degradation. A recent study has provided genomewide information on protein numbers in Escherichia coli along with their celltocell variability [6]. In total, 121 were classified as essential, while 894 were classified as nonessential. Addressing multimerization, it was found that 719 proteins are functional in a monomeric form, while 198 function in a dimeric form, 16 in trimeric, 47 in tetrameric, and the remaining in higherorder forms. Multimerization is likely to arise from the need for functionality, and such a need varies significantly between proteins. Some proteins are functional both in monomeric as well as in various multimer forms [7,8], while others are only functional in a specific form [9].
The process of multimerization, aside from being related to the functionality of the proteins, may also affect the dynamics of the processes that the proteins regulate. This is expected given that multimerization necessarily affects the mean numbers of functional proteins, the response times of the cell (e.g. to external signals), and the degree of correlation between RNA numbers and the corresponding functional protein complex numbers, i.e. the degree of control that transcription factors have on the protein complex numbers over time. These effects can be expected to propagate to the network level. For example, in genetic switches, where stochastic fluctuations in protein numbers determine, among other factors, the switching frequency [10,11], cooperative binding of the proteins enhances the range of conditions for which bistability is observed [12].
The dynamics of protein abundance depend on the transcriptional and translational dynamics as well as on the kinetics of degradation of RNA and proteins. Therefore, to assess the effects of multimerization on the dynamics of gene expression and of genetic circuits one needs to model the kinetics of these processes in detail. The RNA production rate of a gene is mainly controlled during the process of transcription initiation, at the promoter region (see [1] for a review). Recent in vivo measurements of the intervals between the production of individual transcripts [13,14] suggest that, under normal growth conditions, there are two to three significant ratelimiting steps at the initiation stage that, aside from determining the mean rate of production, also determine the degree of noise in the process of RNA production. In prokaryotes, these observations relate directly to protein copy numbers, which tend to follow closely those of RNA [15]. To account for the stochasticity and the rate limiting steps of the underlying steps in the process of gene expression, we use the delayed stochastic modeling strategy [16] to drive the dynamics of the models, as it allows the use of nonMarkovian dynamics to model the noninstantaneous processes underlying transcription and translation [17]. The parameters used in the models are extracted from live, singlecell, singlemolecule measurements [6,13,14].
Using the modeling and simulation techniques mentioned above, along with realistic parameter values, we investigate the consequences of multimerization on mean numbers and fluctuations and on the response time of functional protein complex numbers to external signals. Further, we investigate whether these effects have tangible consequences on the kinetics of a small genetic circuit. Finally, we interpret our results in the light of recent in vivo measurements of mean and variability of protein numbers in E. coli.
Methods
We use a stochastic model of gene expression [16] that describes transcription, translation, degradation of mRNA and proteins, and multimerization (binding and unbinding of proteins). The model is implemented using a delayed variant [17] of the Stochastic Simulation Algorithm (SSA) [18], which is similar to the original SSA, but allows arbitrary delays before the release of each of the products of a reaction. A reaction product X with a delay τ is represented by X(τ).
Model of gene expression and RNA and protein degradation
Transcription is modeled by:
where S stands for an available transcription start site (TSS) of a gene and M stands for the mRNA coded by that gene. In this reaction, τ_{S }accounts for the duration of the process of transcription, including the finding of a promoter region by an RNA polymerase, the formation of the closed complex at the transcription start site, the open complex formation, and finally, the promoter escape [19] and elongation. Of these, in general, the most ratelimiting steps are the processes of isomerization and open complex formation [1,2].
To model this multistep process, we set the reaction rate to infinity, which causes the reaction to occur the moment the reactants become available. Given this, the parameter τ_{S }fully determines the interval between consecutive productions of transcripts. In our implementation, each time this reaction occurs, a value of τ_{S }is drawn from a gamma distribution with mean of k_{M}^{1 }and coefficient of variation (variance over the mean) of . With proper parameter values, the gamma distribution well approximates recent live cell measurements of intervals between productions of consecutive RNAs in E. coli [13,20]. We fitted the measurements of time intervals in [20] with the threeexponential model proposed in that work, and with a gamma distribution. The latter results in (α_{M}, (α_{M }k_{M})^{1}) of (2.27183, 1070.57) and (2.51171, 565.956) for the low and medium induction levels, respectively. The gamma fits have slightly higher likelihood than the threeexponential ones, so the fit is better.
According to this model, the transcript is released at the same time as the promoter region becomes unoccupied. This approximation assumes that the elongation time is negligible, which relies on observation that the durations of the closed and the open complex formations (in the order of 10^{3 }s) [1,2,13,20] are much longer than elongation (in the order of 10^{1 }or 10^{2 }s) [21,22]. Moreover, in prokaryotes, translation is coupled to transcription [23], and can initiate as soon as the ribosome binding site region (RBS) of the RNA is formed (ShineDalgarno sequence) [24]. Consequently, the RNA is available for translation very soon after the RNA polymerase escapes the promoter region.
The RNA, once assembled, is subject to degradation, which we chose to model as a firstorder reaction (due to a lack of evidence of degradation mechanisms that depend on, for example, RNA abundance or sequence [25]):
where d_{M }^{1 }is the mean mRNA lifetime.
In this model, the degree of noise on the RNA production kinetics can be tuned by varying α_{M}. Setting α_{M }= 1 yields Poisson distributed mRNA numbers M ~ Poi(k_{M }d_{M}^{1}), while α_{M }> 1 and α_{M }< 1 result in sub and superPoissonian distributions of RNA numbers, respectively (both of which have been reported in E. coli [6,20]). We note that, for integer values of α_{M}, the parameter has a physical interpretation: namely, it represents a sequential process with α_{M }elementary steps, each of duration (α_{M }k_{M})^{1}, which is in accordance with a sequential process of transcription initiation [1]. However, the best fit is typically obtained for noninteger values of α_{M}, which do not have a simple physical interpretation. One possible explanation is that the steps have unequal durations or that a step has nonexponential duration (e.g. the open complex formation that involves structural changes of the DNA). Similarly, superPoissonian RNA dynamics [6] (α_{M }< 1) require the existence of some additional mechanisms, such as a twostate model of transcription [26].
Translation is modeled by Reaction 3, where k_{P }is the stochastic rate of translation initiation and M is the number of available RNA molecules [27].
where P is protein and τ_{P }is the time it takes for the protein to be folded and activated, after translation is complete.
In the simulations, τ_{P }was set to zero for simplicity, in models of single gene expression, this parameter only shifts the protein numbers in time. If this delay is taken to be a random variable, it also results in increased fluctuations of the protein numbers. For the longterm behavior the timeshift is irrelevant, and the estimations of the contribution to noise are considerable smaller than those from other sources [28]. We tested adding such noise (by setting τ_{P }to follow a normally distributed delay) and found no qualitative differences in our conclusions.
Finally, the degradation of proteins is modeled via Reaction 4, a first order process [6]. (The rate of protein degradation has been observed to be constant, and identical in different growth conditions [29].)
where d_{P}^{1 }is the mean protein lifetime.
Modeling the multimerization process
In the case of homomers we consider multiple levels of multimerization (e.g. monomers, dimers, trimers), while for heteromers we only consider secondorder multimers, i.e. heterodimers. Note that, in the case of heteromers, the production of each of the two monomers is driven by a different promoter, while in the case of homomers, we assume that there is only one promoter driving the expression.
Heterodimerization and the reverse of this process (which can occur by dissociation or degradation) is modeled by the following reactions:
where P_{1 }and P_{2 }represent the monomers that form the heterodimer P_{1,2}, when bound to one another. Reactions 5 and 6 model the association and disassociation of monomers, respectively, with a_{1,2 }being the rate of association and u_{1,2 }the rate of disassociation. Reactions 7 and 8 model the degradation of monomers P_{1 }and P_{2}, respectively, while in the dimeric form. We denote the number of proteins i in either monomeric (P_{i}) or dimeric form (P_{i,j}) by X_{i }= P_{i }+ P_{i,j}.
The process of production of homomers of order N is modeled as follows:
where P_{i×k }denotes , the kth order homomer of proteins P_{i}. Reactions 9 represent the association of an orderk homomer and an order(n  k) homomer to form an ordern homomer, while Reactions 10 represent the reverse process. Reactions 11 represents the degradation of any of the n proteins that are part of an ordern homomer, resulting in an order(n  1) homomer. The rates a_{i×k,i×}_{(}_{nk}_{)}and u_{i×k,i}_{×}_{(}_{nk}_{)}, are the association and disassociation rates for the combinations of different order homomers, and is the protein degradation rate. We define , as the total number of proteins in the system, regardless of their form. This definition is analogous to that of X_{1 }and X_{2 }in the heterodimer model.
Toggle switch
We model a genetic toggle switch [30], which consists of two genes, expressing proteins P_{1 }and P_{2}, respectively. The protein expressed by the first gene inhibits the expression of the second gene, whose protein product in turn inhibits the expression of the first gene. Interactions between repressor proteins and promoters are implemented by assigning the rate k_{M }in Reaction 1 to be a function of the number of repressor molecules present in the system, as follows:
where P_{i×n }is the ordern multimer of gene i, K_{i,j }is the disassociation constant for the multimer binding to the promoter of gene j, and and are the maximal and effective transcription rates of the jth gene, respectively. (Here (i, j) = (1, 2) or (i, j) = (2, 1).)
Results
All models and simulations were performed using the simulator SGNS2 [31]. The following description of parameter selection applies to all simulations, unless otherwise mentioned. The protein degradation rate d_{P }is set to unity. This reduces the dimension of the parameter space: rate constants and time delays are expressed in units of protein lifetime. The parameters d_{M}, k_{M}, and k_{P }are varied logarithmically within the range [10^{−1}, 10^{1}], α_{M }is varied in the set {1, 2, 3, 5, 10}. Each of the parameters is varied independently. Variation in the parameter values within these ranges leads to significant variation in protein abundance (e.g. a range of 10^{6 }in the mean protein level).
To quantify changes in mean and noise levels when comparing models, we define "gain" as the ratio of the value of the tested model to that of the null model. Gains above unity imply that the tested model exhibits values larger than the null model, while gains less than unity imply the opposite.
For simplicity, the multimerization association rates a_{i×n,i}_{×(}_{nk}_{)} are assumed to be infinite, while the disassociation rates u_{i×n,i×}_{(}_{nk}_{)} are set to zero, this does not affect our results qualitatively, and facilitates comparison between models. This issue is further discussed in the results section. Finally, we run each simulation for 10^{5 }time units so that the system spends most of the time near equilibrium. We sample the state of the system (all molecules numbers) with intervals of one time unit.
Note that we include the transient in the samples as we sample from time zero. This is due to the fact that the system does not reach an equilibrium in a finite time interval. From observations of the time series we found that, for a duration of 10^{5 }time units, the systems is, for more than 99.9% of time, close to equilibrium. That is, given 10^{3 }simulations, if one extracts samples of multimer numbers from this region, one cannot distinguish them, in a statistical sense, from the samples of the distribution of multimer numbers at the last time moment.
Homodimers
We compared the mean levels of a monomeric protein (X_{1}) and of a homogeneous dimer (P_{1,1}). The two models are taken to be identical except for the dimerization. Since the expression rates are identical, the mean level of the dimer must be less than or equal to half the mean level of monomer. We consider two cases for the dimer model: one in which only the dimer is functional, and one in which both the monomer and dimer are functional. In the latter case, we asses the joint dynamics of both the monomeric (P_{1}) and dimeric (P_{1,1}) forms. In this case, the amount of functional proteins is given by .
We simulated the models with several parameters values of d_{M}, α_{M}, k_{M}, and k_{P }as described above. Taking μ as the mean level of the molecules of interest and as the mean level in the monomeric model, the ratio is plotted as a function of in Figure 1. (The mean is determined by d_{M}, k_{M}, and k_{P}).
Figure 1. Change in mean levels due to homodimerizationRelative mean levels of homodimers P_{1,1 }and the total number of molecules as a function of the mean monomer level . The dashed line indicates a gain of one half. The inset shows linear gain.
From Figure 1, we observe that for high values of , the mean level of the homodimers P_{1,1}, is half that of X_{1}, while for low values of it approaches zero, because it is more probable that there is a single protein in the system, precluding the formation of a dimer. The total number of monomers and dimers (Y_{1}) varies in an inverse fashion to that of dimers alone, since Y_{1 }= X_{1 } P_{1,1 }(cf. inset in Figure 1).
The points in Figure 1, while each being resultant from a unique set of parameter values, are grouped into bands. This is due to the fact that various combinations of parameter values result in identical mean levels but differing noise levels. The changes due to varying individual parameter values can be explained as follows. The expected mean protein level is determined by k_{M }d_{M}^{−1 }k_{P }d_{P}^{−1}, while the noise is increased with the inverse of the mean and the inverse of α_{M}, in an intricate manner (see [32] for an approximation). It follows that increasing (decreasing) k_{M }or k_{P }or decreasing (increasing) d_{P }will result in an increase (decrease) in the protein mean level (xaxis) and a consequent increase (decrease) in the mean gain (yaxis), and increasing (decreasing) α_{M }will have no effect on the protein mean (xaxis) and will decrease (increase) the mean gain (yaxis).
Next, using the same models, we compared the noise levels, quantified by the square of the coefficient of variation, denoted by η. The results are shown in Figure 2. When comparing with Figure 1, it is important to note that, in general, models with low and high noise levels correspond to the models with high and low mean levels, respectively. This relationship holds for low mean levels, for which the lowcopy number noise dominates, whereas for high mean levels other parameters dominate the noise. The points corresponding to simulations with identical mean levels of X_{1 }are again contiguous, but they do not form vertical lines.
Figure 2. Change in noise levels due to homodimerization. Relative noise levels of homodimers P_{1,1 }(upper panel) and the total number of molecules (lower panel) as a function of the noise level of monomers .
Given the properties of the model, the noise in the homodimer numbers (P_{1,1}) is always greater than that of the nondimerizing proteins (X_{1}). For parametrizations which yield a dimer level equal half of the number of protein units in the system (i.e. right hand side of Figure 1) the noise gain is equal to unity, implying no increase in noise due to the dimerization process. However, further decreases in the mean levels lead to significant increase in the noise (gains of the order of 10^{2}, as seen in the upper panel of Figure 2).
The lower panel of Figure 2 indicates that the noise in the total number of molecules (Y_{1}) is always smaller than that of the monomers. In the two extremes, the total number consists entirely of monomeric or dimeric forms of the protein, so the noise level of the functional proteins must match that of a single form. However, when the numbers of the monomeric and dimeric form are balanced, the noise level of the functional molecules (Y_{1}) is slightly suppressed by the dimerization when compared to X_{1}.
It is possible to see that the choice of multimerization association rates a_{i×n,i×}_{(}_{n−k}_{)}and disassociation rates u_{i×n,i×}_{(}_{n−k}_{)} does not affect the above results qualitatively. Any other settings will inevitably lower the number of dimers. Thus, the conclusion that dimer numbers must be lower than half the number of monomers holds. Also, the number of monomers and dimers (Y_{1}) will increase, since they will still follow the relationship Y_{1 }= X_{1 }− P_{1,1}. Additionally, the noise in dimer numbers will increase due to the lowcopy number effect, and consequently, the conclusion that the noise must be above unity holds. Finally, the noise in the numbers of monomers and dimers will become more similar to that of the monomers alone (resulting in a noise gain closer to unity).
Heterodimers
Next, we consider a scenario in which a dimer is formed by the protein products P_{1 }and P_{2 }of two distinct genes. For simplicity, the kinetics of protein production are assumed to be identical for P_{1 }and P_{2}. We compared the behaviour of this heterodimer with a corresponding homodimer model. (Alternatively, one could consider a model in which a single promoter controls the expression of P_{1 }and P_{2}. We opted not to investigate this case, because the kinetics would lie somewhere between the homodimer case and the heterodimer case described above). For the purposes of comparison, the mRNA production rate k_{M }is doubled in the homodimer, to compensate for the existence of two genes (each expressing at rate of k_{M}) producing the components of the heterodimer.
The ratio of the mean levels of the heterodimer and homodimer is plotted in Figure 3 as a function of the mean level of one of the proteins (X_{1}, or equivalently X_{2}). As in the homodimer case, when the mean levels is high, nearly all proteins are present in dimeric form, and so both models have the same mean abundance of functional protein, whereas for low means, there is a population of unpaired proteins which results in a reduction of the mean level of the dimer when compared to the nondimerizing gene. Moreover, the heterodimer case exhibits greater reductions in the mean than the homodimer case, since to form a dimer, the "missing" protein has to be of a certain type.
Figure 3. Change in mean levels due to heterodimerization. Relative mean levels of heterodimers P_{1,2 }and homodimers P_{1,1 }(with double k_{M }to compensate for the reduction in the mean level) as a function of the mean monomer level . The inset shows linear gain.
We also studied the ratio of the noise levels of the above models, as presented in Figure 4. The noise levels exhibit a behavior similar to the homodimer case presented previously (cf. Figure 2), but since in the present case the mean level is not halved (due to the increased transcription rate k_{M}) the noise gain can be decreased below unity. Specifically, for high mean levels in the homodimer, the noise is suppressed to one half, essentially due to the doubled transcription rate, in this case the dimerization process does not introduce much noise (noise gain equals unity, Figure 2). On the other hand, for low mean levels, the results follow those presented earlier. That is, the greater decrease of mean numbers results in higher gain in noise levels. Moreover, we find that the noise suppression ability of the heterodimeric form is less than that of the homodimer, due to the weaker temporal correlation between the numbers of the two distinct dimerforming proteins.
Figure 4. Change in noise levels due to heterodimerization. Relative noise levels of heterodimers P_{1,2 }and homodimers P_{1,1 }(with double k_{M }to compensate for the reduction in the mean level) as a function of the noise level of monomers X_{1}. The dashed line indicates a gain of unity.
Higherorder multimers
Finally, we studied if and how the results generalize for higherorder multimers. Since the effects were more prominent in homodimers, we studied only multimers of homogeneous proteins of increasing order. We present results for homomers of orders N ∈ {2, 3, 4, 5} (dimer, trimer, tetramer, and pentamer, respectively). We also tested for decamers (N = 10) (data not shown) as an extreme case, and found the qualitative results to agree with those presented here.
Analogous to the homodimeric case (Figure 1), the mean levels of orderN homomers are subject to gains of at most N^{−1}. In addition, as N increases, there is an ever increasing probability of lacking the necessary components to form the multimers, so the values of gain are generally lower than N^{−1}. We note that even models with mean levels of proteins in the order 10^{3 }are subject to significant losses in the multimerization procedure as the order is increased (e.g. N = 5). Likewise, the noise gain follows the trend shown earlier in Figure 2, with higherorder homomers being more noisy.
In Figure 5, we show the noise in homomerization in the case where all protein forms (P_{1}, ⋯, P_{1×N}) are functional. Here, the results agree with the dimer case (see the lower panel of Figure 2). Higherorder multimerization can exhibit greater noise suppression capabilities, but only for a more limited range of parameter values that lead to properly balanced numbers of the multimers in the various forms.
Figure 5. Change in noise levels of total number of molecules due to higherorder multimerization. Relative noise levels of total number of molecules with multimerization of different orders as a function of the noise level of monomers X_{1}.
We also compared noise levels of strictly monomeric proteins to those of multimers. For this comparison, the transcription rate of the proteins composing the multimers are chosen so that the mean numbers of the multimer form are similar to those of the strict monomer. The results (Figure 6) are similar to the homodimer case (Figure 4). Potentially, this scheme allows the noise level to be suppressed to N^{−1}th of the original value, but this is only achievable for highly expressed genes. In general, higherorder multimerization can only lead to noise suppression within a limited range of parameter values. More specifically, in the case of high order multimers, the fluctuations in protein numbers alone determines if the noise in multimer numbers is amplified or suppressed.
Figure 6. Change in noise levels of homomers due to higherorder multimerization. Relative noise levels of homomers (with adjusted k_{M }to compensate for the reduction in the mean level) with multimerization of different orders as a function of the noise level of monomers . The dashed lines indicate gains of unity, one half, one third, one quarter, and one fifth.
Temporal regulation of the number of multimers
In organisms such as bacteria, regulation of gene expression is performed mostly at the stage of transcription initiation, at the promoter region. Consequently, temporal variability in monomer levels is strongly controlled by factors regulating transcription initiation. However, the production of multimeric proteins involves an additional stochastic process  multimer formation itself. As a result, one expects that a mechanism operating at the stage of transcription initiation may exhibit reduced control over the temporal numbers of multimer, as compared with proteins that function as monomers. This may pose limits on the selection of higherorder multimers.
We studied how the process of multimerization affects the ability to regulate multimer numbers via the regulation of the kinetics of production of the monomers alone. We hypothesize that the optimal design would have the multimer numbers following the monomer numbers as closely as possible. That is, the crosscorrelation between the numbers of monomers and multimers should be unity at zerolag, the lag referring to the timeshift in the series of the two numbers for which the correlation is evaluated. This crosscorrelation should also decay as quickly as possible with lag, because otherwise the correlation with past events would make it difficult for the system to respond to current changes.
We found that, in general, the crosscorrelation functions estimated from our simulations exhibited maximal correlation at zerolag. We thus use the crosscorrelation at zerolag to quantify the loss in control due to the multimerization process. To study the decay of the crosscorrelation in each model, we estimate the point in lag where the crosscorrelation attains a value that is half of the maximum, denoted by halflife of the proteinhomomer crosscorrelation. We note that, for an exponential decay of correlation, this halflife would equal ln 2 times the mean response time. However, since the decays measured are not purely exponential, but rather combinations of several exponentially decaying terms, the halflife only reflects the response times in a qualitative sense.
To assess these quantities, we sampled the state of the models with intervals of 1/10 of one time unit, and ran the simulations to obtain 10^{5 }samples. For each multimer order, we compared the halflife of the proteinhomomer crosscorrelation with the crosscorrelation at zerolag (Figure 7). The results indicate that for higher orders of multimerization, there is a loss in correlation in the homomers, when the value of the correlation was high. The results indicate that as the order of multimerization increases, the correlation at zero lag of the homomers decreases. This is only significant if these homomers had high crosscorrelation to begin with. Moreover, in general, high correlations imply higher halflives regardless of the order of the multimer, which indicates that the multimers cannot exhibit high control and fast regulation at the same time. Also generally, for multimers with a specific value of correlation at zero lag, lower order multimers will have shorter response times.
Figure 7. Crosscorrelation between homomers and monomers. Halflife (in units of protein lifetime) of crosscorrelation between homomer and monomer levels as a function of the zerolag (maximum) crosscorrelation between the two.
Genome wide assessment of celltocell variability and degree of multimerization in Escherichia coli
In [6], genomewide data was collected on the mean and standard deviation of protein copy numbers in populations of E. coli under optimal growth conditions, for large sets of both essential and nonessential genes. (Essentiality of a gene is defined according to the following criteria (http://www.shigen.nig.ac.jp/ecoli/pec/index.jsp webcite): in general, genes for which lethal mutants have been isolated are classified as essential.) From the EcoCyc database (http://www.ecocyc.org/ webcite) for the strain E. coli K12 MG1655, we assessed which of these proteins form multimers and, if so, how many subunits of each gene is involved in the multimer.
Table 1 presents the fraction of proteins that form each of the various orders of multimers, for both essential and nonessential genes. Also, for each order we computed the median (med μ) of the mean protein numbers and the median of the squared coefficient of variation (med η) of protein numbers in individual cells. Note that the mean and noise levels are extracted from observation of individuals proteins alone, rather than proteins in multimeric form.
Table 1. Mean and noise in bacterial genes as a function of multimerization noise.
In general, essential genes exhibit higher mean levels than nonessential ones. Also, their noise levels appear to be somewhat constant [6]. Further, proteins from essential genes appear to form higherorder functional units, and the mean levels of proteins forming highorder multimers are much higher. Nonessential genes also exhibit higher mean levels of protein numbers when forming highorder multimers.
In [6], it is also suggested that the protein numbers of essential genes lie on a noise floor. This floor was hypothesized to originate from fluctuations in cellular components (e.g. metabolites, polymerases, ribosomes) [6]. Our results above suggest that multimerization should offer a means to reduce the copy number noise level of proteins in the functional form below this noise floor. If so, one would expect the protein products of essential genes to have a greater tendency for multimerization than nonessential ones, since they already lie on the noise floor in the monomeric form while the latter should be able to select for reduced noise by other means, such as tuning the noise in the process of RNA production. The data in the EcoCyc database agrees with this prediction. Further, for this strategy of noise reduction in essential genes to be successful, one would expect to observe also much higher mean protein numbers in the case of highly multimerizing genes. This is also confirmed by the data in Table 1.
Toggle switch
Finally, we tested if multimerization can affect the stochastic behavior of genetic circuits. To this end, we simulated models of genetic toggle switches [30], using homomers of different order (N ∈ {1, 2, 3}) as regulatory molecules. We then measured the mean switching times of each model, that is, the average time the switch spends on one of the two states (either P_{1×N }< P_{2×N }or P_{1×N }> P_{2×N}). To account for the fact that the mean switching time is sensitive to the mean protein levels, the dimer and the trimer were simulated with double and triple k_{M}, respectively (to provide similar mean level of the regulatory molecules in the different models).
The parameters used in the models were: RNA degradation rate d_{M }= 6 d_{P}, expected transcript number k_{M }N^{−1}d_{M}^{−1 }= 5, transcription kinetics shape α_{M }= 1, expected number of protein per RNA k_{P }d_{P}^{−1 }= 5, and disassociation of repression K = C k_{M }N^{−1 }d_{M }^{−1 }k_{P }d_{P}^{−1}, where C was varied in the range [10^{−4}, 10^{4}] with approximately logarithmic spacing (i.e. {a 10^{b} a ∈ {1, 2, 3, ⋯ , 9}, b ∈ {−4, −3, −2, ⋯ , 4}}). The gene expression parameters are in agreement with live cell measurements in E. coli [6]. The switch's state was sampled with intervals of 1/30 time units, the simulations provided 10^{6 }samples. The mean switching time as a function of the inverse of the repression strength C is shown in Figure 8.
Figure 8. Change in mean switching time of toggle switch due to multimerization. Mean switching time of a toggle switch as a function of the inverse repression strength C, where the genetic interactions are implement with different orders of homomers, as a function of disassociation strength of repression.
In Figure 8 it is visible that the mean switching time is different for different orders of homomerization. In the region where the repression is strong, the multimerization results in increased switching times. On the other hand, for low repression strength, the mean switching time is decreased for the homomers. In general, higherorder multimerization appears to offer a wider range of mean switching times for the toggle switch. These differences in the kinetics of the models are due to the differences in noise levels of the functional multimers of different orders, confirming thus that the order of multimerization has a tangible effect on the kinetics of genetic circuits.
Conclusion
We studied how the order and nature of the multimerization of a protein affects the temporal variability in copy number. We found that multimerization increases noise, in that it necessarily reduces the numbers of functional protein complexes. However, if both monomers and dimers (or higherorder multimers) are functional, the dimerization process suppresses noise in the numbers of functional complexes, for a range of parameter values for which dimers and monomers are present in similar amounts. Alternatively, if the introduction of a multimerization process is combined with an increase of transcription rates to compensate for the decrease in number of functional complexes, then multimerization can also lead to a reduction of noise levels on the numbers of these functional complexes. The same holds true for heterodimers, but the noise suppression is less significant because the production of the subunits is less coordinated.
In addition, multimerization reduces the degree of control exerted by gene regulatory mechanisms on the copy number of functional complexes. Compensatory increases in this control, which are constrained by the noise introduced by the multimerization process, will necessarily lead to an increase on the mean response time of the gene.
Finally, the stochastic effects of multimerization were found to propagate to the level of genetic circuits, further supporting the notion that this process is likely under selection pressure for reasons other than functionality of proteins: namely, for their effects on the dynamics of protein numbers and on the dynamics of genetic circuits. This selective pressure may be confirmed by future studies, but the observation that essential genes (whose numbers of the proteins in monomeric form alone lie on the noise floor) are more likely to multimerize than nonessential ones, is already tentative evidence for the existence of this pressure.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
ASR, BI, and AH conceived the study. ASR supervised the interpretation of data. AH and HT performed the modeling and analysis. All authors performed research. AH and ASR drafted the manuscript. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Acknowledgements
Work supported by the Academy of Finland, the Finnish Funding Agency for Technology and Innovation, and the Science Foundation of Tampere City (Finland). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Declarations
The publication costs for this article were funded by the Finnish Funding Agency for Technology and Innovation (grant no. 40226/12).
This article has been published as part of BMC Systems Biology Volume 7 Supplement 1, 2013: Selected articles from the 10th International Workshop on Computational Systems Biology (WCSB) 2013: Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/7/S1.
References

McClure WR: Mechanism and control of transcription initiation in prokaryotes.
Annu Rev Biochem 1985, 54:171204. PubMed Abstract  Publisher Full Text

Lutz R, Lozinski T, Ellinger T, Bujard H: Dissecting the functional program of Escherichia coli promoters: The combined mode of action of Lac repressor and AraC activator.
Nucleic Acids Res 2001, 29(18):38733881. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Landick R: The regulatory roles and mechanism of transcriptional pausing.
Biochem Soc Trans 2006, 34(6):10621066. PubMed Abstract  Publisher Full Text

Rajala T, Hakkinen A, Healy S, YliHarja O, Ribeiro AS: Effects of transcriptional pausing on gene expression dynamics.
PLoS Comput Biol 2010, 6(3):e1000704. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Holberg CI, Tran SEF, Eriksson JE, Sistonen L: Multisite phosphorylation provides sophisticated regulation of transcription factors.
Trends Biochem Sci 2002, 27(12):619627. PubMed Abstract  Publisher Full Text

Taniguchi Y, Choi PJ, Li GW, Chen H, Babu M, Hearn J, Emili A, Xie XS: Quantifying E. coli proteome and transcriptome with singlemolecule sensitivity in single cells.
Science 2010, 329(5991):533538. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Ollivierre JN, Sikora JL, Beuning PJ: The dimeric SOS mutagenesis protein UmuD is active as a monomer.
J Biol Chem 2011, 286(5):36073617. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Escher A, O'Kane DJ, Lee J, Szalay AA: Bacterial luciferase alpha beta fusion protein is fully active as a monomer and highly sensitive in vivo to elevated temperature.
Proc Natl Acad Sci USA 1989, 89(17):65286532. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

D'Autreaux B, Pecqueur L, Gonzalez de Peredo A, Diederix REM, CauxThang C, Tabet L, Bersch B, Forest E, MichaudSoret I: Reversible Redox and Zincdependent dimerization of the Escherichia coli Fur protein.
Biochemistry 2007, 46(5):13291342. PubMed Abstract  Publisher Full Text

Ribeiro AS: Effects of coupling strength and space on the dynamics of coupled toggle switches in stochastic gene networks with multipledelayed reactions.
Phys Rev E 2007, 75(6):061903. PubMed Abstract

Ribeiro AS: Dynamics of a twodimensional model of cell tissues with coupled stochastic gene networks.
Phys Rev E 2007, 76(5):051915. PubMed Abstract

Lipshtat A, Loinger A, Balaban NQ, Biham O: Genetic toggle switch without cooperative binding.
Phys Rev Lett 2006, 96(18):188101. PubMed Abstract  Publisher Full Text

Kandhavelu M, Mannerstrom H, Gupta A, Hakkinen A, LloydPrice J, YliHarja O, Ribeiro AS: In vivo kinetics of transcription initiation of the lar promoter in Escherichia coli: Evidence for a sequential mechanism with two ratelimiting steps.
BMC Syst Biol 2011, 5:149. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Muthukrishnan AB, Kandhavelu M, LloydPrice J, Kudasov F, Chowdhury S, YliHarja O, Ribeiro AS: Dynamics of transcription driven by the tetA promoter, one event at a tive, in live Escherichia coli cells.
Nucleic Acids Res 2012, 40(17):84728483. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Kaern M, Elston TC, Blake WJ, Collins JJ: Stochasticity in gene expression: From theories to phenotypes.
Nat Rev Genet 2005, 6(6):451464. PubMed Abstract  Publisher Full Text

Ribeiro AS, Zhu R, Kauffman SA: A general modeling strategy for gene regulatory networks with stochastic dynamics.
J Comp Biol 2006, 13(9):16301639. PubMed Abstract  Publisher Full Text

Roussel MR, Zhu R: Validation of an algorithm for delay stochastic simulation of transcription and translation in prokaryotic gene expression title.
Phys Biol 2006, 3(4):274284. PubMed Abstract  Publisher Full Text

Gillespie DT: Exact stochastic simulation of coupled chemical reactions.
J Phys Chem 1977, 81(25):23402361. Publisher Full Text

deHaseth PL, Zupancic ML, Record MT Jr: RNA polymerasepromoter interactions: the comings and goings of RNA polymerase.
J Bacteriol 1998, 180(12):30193025. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Kandhavelu M, Hakkinen A, YliHarja O, Ribeiro AS: Singlemolecule dynamics of transcription of the lar promoter.
Phys Biol 2012, 9(2):026004. PubMed Abstract  Publisher Full Text

Herbert KM, La Porta A, Wong BJ, Mooney RA, Neuman KC, Landick R, Block SM: Sequenceresolved detection of pausing by single RNA polymerase molecules.
Cell 2006, 125(6):10831094. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Golding I, Paulsson J, Zawilski SM, Cox EC: Realtime kinetics of gene activity in individual bacteria.
Cell 2005, 123(6):10251036. PubMed Abstract  Publisher Full Text

Miller OL Jr, Hamkalo BA, Thomas CA Jr: Visualization of bacterial genes in action.
Science 1970, 169(3943):392395. PubMed Abstract  Publisher Full Text

Shine J, Dalgarno L: Determinant of cistron specificity in bacterial ribosomes.
Nature 1975, 254(5495):3438. PubMed Abstract  Publisher Full Text

Bernstein JA, Khodursky AB, Lin PH, LinChao S, Cohen SN: Global analysis of mRNA decay and abundance in Escherichia coli at singlegene resolution using twocolor fluorescent DNA microarrays.
Proc Natl Acad Sci USA 2002, 99(15):96979702. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Peccoud J, Ycart B: Markovian modelling of gene product synthesis.
Theor Popul Biol 1995, 48(2):222234. Publisher Full Text

Yu J, Xiao J, Ren X, Lao K, Xie XS: Probing gene expression in live cells, one protein molecule at a time.
Science 2006, 311(5767):16001603. PubMed Abstract  Publisher Full Text

Makela J, LloydPrice J, YliHarja O, Ribeiro AS: Stochastic sequencelevel model of coupled transcription and translation in prokaryotes.
BMC Bioinf 2011, 12:121. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Nath K, Koch A: Protein degradation in Escherichia coli.
J Biol Chem 1971, 246:69566967. PubMed Abstract  Publisher Full Text

Gardner TS, Cantor CR, Collins JJ: Construction of a genetic toggle switch in Escherichia coli.
Nature 2000, 403(6767):339342. PubMed Abstract  Publisher Full Text

LloydPrice J, Gupta A, Ribeiro AS: SGNS2: A compartmentalized stochastic chemical kinetics simulator for dynamic cell populations.
Bioinf 2012, 28(22):30043005. Publisher Full Text

Pedraza JM, Paulsson J: Effects of molecular memory and bursting on fluctuations in gene expression.
Science 2008, 319(5861):339343. PubMed Abstract  Publisher Full Text