Telethon Institute of Genetics and Medicine (TIGEM), Naples, Italy

Department of Engineering Mathematics, University of Bristol, Bristol, UK

Department of Computer and Systems Engineering, University of Naples Federico II, Naples, Italy

Abstract

Background

RNA interference (RNAi) is a regulatory cellular process that controls post-transcriptional gene silencing. During RNAi, double-stranded RNA (dsRNA) induces sequence-specific degradation of homologous mRNA via the generation of smaller dsRNA oligomers of 21-23 nt in length (siRNAs). siRNAs are then loaded onto the RNA-Induced Silencing multiprotein Complex (RISC), which uses the siRNA antisense strand to specifically recognize mRNA species that exhibit a complementary sequence. Once the siRNA-loaded RISC binds the target mRNA, the mRNA is cleaved and degraded, and the siRNA-loaded RISC can degrade additional mRNA molecules. Despite the widespread use of siRNAs for gene silencing, and the importance of dosage for silencing efficiency and for avoiding off-target effects, none of the numerous mathematical models proposed in the literature has been validated to quantitatively capture the effects of RNAi on target mRNA degradation at different siRNA concentrations. Here, we address this open problem by performing in vitro RNAi experiments in mammalian cells and by testing and comparing different mathematical models, fitting the data generated in silico by each model to the experimental data. We performed in vitro experiments in human and hamster cell lines constitutively expressing the EGFP protein or the tTA protein, respectively, measuring both mRNA levels, by quantitative real-time PCR, and protein levels, by FACS analysis, over a large range of siRNA oligomer concentrations.

Results

We tested and validated four different mathematical models of RNA interference by quantitatively fitting models' parameters to best capture the in vitro experimental data. We show that a simple Hill kinetic model is the most efficient way to model RNA interference. Our experimental and modeling findings clearly show that the RNAi-mediated degradation of mRNA is subject to saturation effects.

Conclusions

Our model has a simple mathematical form, amenable to analytical investigation, and a small set of parameters with an intuitive physical meaning, which make it a unique and reliable mathematical tool. The findings presented here will be useful for better understanding RNAi biology and as a modelling tool in Systems and Synthetic Biology.

Background

RNA interference (RNAi) is a well characterized regulatory mechanism in eukaryotes

Despite its widespread experimental application, the best way to quantitatively model RNA interference is still under debate. In systems and synthetic biology, mathematical models are essential to carry out in silico investigations of biological pathways, or novel synthetic circuits. The aim of this work is to find the most appropriate quantitative mathematical model that can correctly describe the RNAi phenomenon in mammalian cells, for varying concentrations of the siRNA oligomers.

A schematic representation of the RNA interference mechanism is illustrated in Figure

**Schematic representation of RNA interference in a mammalian cell**. Step 1: double stranded RNA (dsRNA) elicits a response in the cell mediated by the enzyme Dicer, which cleaves the dsRNA into fragments of 21-23 base pairs (siRNA). Step 2: siRNAs are loaded into a multiprotein complex called RNA Induced Silencing Complex (RISC) and one strand (the passenger strand) is discarded and degraded

Results and Discussion

In order to model the effects of RNA interference on mRNA expression levels at different concentrations of siRNA oligomers, we carried out in vitro RNA interference experiments on two mammalian cell lines stably expressing the EGFP protein or the tTA protein, respectively.

In the first set of experiments (set I), Human Embryonic Kidney cells stably expressing EGFP (HEK293-EGFP cell-line), were transfected with varying quantities of synthetic siRNA oligomers directed against the

**Ratio of EGFP mRNA levels between cells transfected with the siRNAs specific for EGFP, and negative control cells, transfected with a non-specific siRNA, measured 48 hours after transfection**. The x-axis reports the different quantities of siRNA oligomers tested. mRNA levels were measured using real-time PCR. Error bars represent one standard error from three biological replicates for each point.

**Ratio of EGFP protein levels between cells transfected with the siRNAs specific for EGFP, and negative control cells, transfected with a non-specific siRNA, measured 60 hours after transfection**. The x-axis reports the different quantities of siRNA oligomers tested. Protein levels were measured using FACS analysis quantifying EGFP protein fluorescence. Error bars represent one standard error from three biological replicates for each point.

Supplementary material for Modeling RNA interference in mammalian cells. Results and fitting of in vitro experiments on hamster ovary cell line (CHO) constitutively expressing tTA protein. We measured mRNA levels, by quantitative Real-Time PCR for a large range of concentrations of siRNA oligomers, from 0.001 pmol to 200 pmol (total concentration). The amounts of transfected siRNA oligomers were: 0, 0.001, 0.01, 0.05, 0.1, 0.5, 1.0, 10.0, 20.0, 40.0, 60.0, 80.0, 100.0 and 200.0 pmol in a total of 2 mL of medium (so the final concentrations of siRNA oligomers were 5 × 10^{-4}, 5 × 10^{-3}, 2.5 × 10^{-2}, 5 × 10^{-2}, 2.5 × 10^{-1}, 5 × 10^{-1}, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0, and 100 nM respectively). Each experiment was performed in biological triplicates, and the resulting standard deviations are computed and reported in each graph. In Additional file
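The correspondence between the transfected amounts (pmol) and the final concentrations (nM) listed above can be checked with a short sketch. It assumes only the 2 mL total medium volume stated in the text; the function and variable names are illustrative and not part of the original work.

```python
# Convert transfected siRNA amounts (pmol) into final concentrations (nM),
# assuming the 2 mL total medium volume stated in the text.
MEDIUM_VOLUME_ML = 2.0

def pmol_to_nM(amount_pmol, volume_ml=MEDIUM_VOLUME_ML):
    """Return the concentration in nM of `amount_pmol` dissolved in `volume_ml`."""
    # 1 pmol/mL = 1e-12 mol / 1e-3 L = 1e-9 mol/L = 1 nM
    return amount_pmol / volume_ml

# Non-zero amounts listed in the text (pmol)
amounts_pmol = [0.001, 0.01, 0.05, 0.1, 0.5, 1.0, 10.0,
                20.0, 40.0, 60.0, 80.0, 100.0, 200.0]
concentrations_nM = [pmol_to_nM(a) for a in amounts_pmol]

for amount, conc in zip(amounts_pmol, concentrations_nM):
    print(f"{amount:8.3f} pmol -> {conc:.4g} nM")
```

The zero-amount condition is left out of the list because it has no corresponding entry among the reported concentrations.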

RNAi Modeling

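We were interested in formulating a model that can quantitatively describe the effects of varying quantities of siRNA oligomers on the degradation of the target mRNA species and of its corresponding protein product. A general dynamical model describing transcription of the mRNA species, its siRNA-mediated degradation, and translation of its protein product can be written as a system of ordinary differential equations (ODEs). Let X_{m}, X_{p} and X_{s} denote the concentrations of the target mRNA, of its protein product and of the siRNA oligomers, respectively:

dX_{m}/dt = k_{m} - d_{m}X_{m} - δ(X_{s}, X_{m})

dX_{p}/dt = k_{T}X_{m} - d_{p}X_{p}

The parameter k_{m} is the transcription rate of the mRNA X_{m} and d_{m} is its basal degradation rate; δ(X_{s}, X_{m}) is the degradation rate of the mRNA X_{m} induced by the siRNA oligomers X_{s}; k_{T} is the translation rate and d_{p} is the degradation rate of the protein. The models considered here differ in the form of δ(X_{s}, X_{m}).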

The different RNA interference models for the RNAi-induced mRNA degradation rate **δ(X_{s}, X_{m})** and their corresponding parameters.

| Model | Parameters |
| --- | --- |
| Model 1 | c_{1}: rate of mRNA-siRNA* complex formation |
| Model 2 | c_{2}: rate of mRNA-siRNA* complex formation; h_{2}: number of siRNA target sites |
| Model 3 | c_{3}: rate of mRNA-siRNA* complex formation; cleavage and dissociation rate of mRNA-siRNA*; h_{3}: number of siRNA target sites |
| Model 4 | maximal degradation rate of the mRNA due to RNAi; Michaelis-Menten-like constant; h_{4}: number of siRNA target sites |

Model 1: The stoichiometric model

One possible way of modeling the effects of RNAi on the mRNA degradation is to consider a stoichiometric reaction between the siRNAs and mRNAs. Let siRNA* denote the concentration of the siRNA-RISC complex, namely the fraction of the siRNAs that are loaded into RISC complexes (step 2 of Figure

Namely, the siRNA-loaded RISC binds to the complementary mRNA and then both are degraded. According to this model, following the law of mass action, the siRNA-mediated degradation is proportional to the product of the concentrations of the siRNA oligomers and of the targeted mRNA species:

δ(X_{s}, X_{m}) = c_{1}X_{s}X_{m}

where the parameter c_{1} represents the proportionality constant. In this modeling approach, the siRNA-RISC complex is assumed not to be recycled, but the RISC needs to be reloaded before it can degrade another mRNA molecule (i.e. in this model the dashed line linking step 4 to step 3 in Figure

Model 2: Stoichiometric model with co-operativity

This model is a straightforward extension of Model 1, which additionally takes into account the presence of multiple sites on the targeted mRNA where the siRNA-loaded RISC can bind. Model 1 can be easily extended to include co-operativity:

As before, the rate of RNAi-driven degradation can be obtained by applying the law of mass action:

δ(X_{s}, X_{m}) = c_{2}X_{s}^{h_{2}}X_{m}

where c_{2} is the proportionality constant and h_{2} is the number of siRNA binding sites on the targeted mRNA species. This model was suggested in
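The two mass-action rate laws described for Models 1 and 2 can be written as small functions. This is a minimal sketch: the function names are hypothetical, the rate forms follow the verbal description (degradation proportional to the siRNA and mRNA concentrations, with an exponent on the siRNA term for Model 2), and the numerical inputs in the usage lines are illustrative only.

```python
def delta_model1(x_s, x_m, c1):
    """Stoichiometric model: degradation proportional to X_s * X_m."""
    return c1 * x_s * x_m

def delta_model2(x_s, x_m, c2, h2):
    """Stoichiometric model with co-operativity: X_s enters with exponent h2."""
    return c2 * (x_s ** h2) * x_m

# Illustrative inputs (arbitrary units for the mRNA, pmol for the siRNA).
x_s, x_m = 10.0, 2.0
c = 1.38e-4  # (pmol min)^-1, the order of magnitude reported for c_1

rate1 = delta_model1(x_s, x_m, c)
rate2_h1 = delta_model2(x_s, x_m, c, 1.0)   # h2 = 1 reproduces Model 1
rate1_double = delta_model1(2 * x_s, x_m, c)  # no saturation: rate doubles with X_s
```

With h2 = 1 the two functions coincide, which is why constraining the exponent of Model 2 to be at least one can collapse it onto Model 1. Neither form saturates as the siRNA amount grows.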

Model 3: Enzymatic model

A detailed model of RNAi specific for mammalian cells was proposed by Malphettes _{3}, of siRNA binding sites. The siRNA-RISC complex (siRNA*) can bind to any site on the mRNA to form an intermediate mRNA-siRNA* complex, which can either accommodate further siRNA-RISC complexes on any other free binding sites, or cleave the target mRNA and dissociate from the cleavage products. The reaction of the complex formation of the target mRNA with the siRNA-RISC complex is described as follows (for details refer to

The model considers the following reaction between an intermediate mRNA-siRNA* _{
i-1 }complex with another siRNA-RISC complex (for all _{3}]):

The generic cleavage and degradation reaction of the mRNA by any interacting siRNA-RISC complex (∀_{3}] is represented by:

In _{m}, X_{s}

where _{3 }
_{3 }is the cleavage and dissociation rate of mRNA-siRNA* complex (reaction 8). This functional form, for a constant _{s}
_{m }= c_{3}X_{s }

This is in perfect agreement with the experimental finding on the enzymatic activity of both non-mammalian and mammalian RISC on mRNA degradation, as reported in _{m}

Model 4 basically assumes that only the maximal rate _{m }
_{s}
_{m }
_{m}

Note also that when _{m }
_{m}
_{m}
_{s}, X_{m}) = V_{m }= c_{3}
_{s}

Model 4: Phenomenological model

In

This model, despite being phenomenological, has interesting properties. Its parameters are the maximal degradation rate of the mRNA due to RNA interference, the concentration of siRNA oligomers needed to achieve half of the maximal degradation rate, and the exponent h_{4}; the latter two depend on the efficiency of siRNA binding to its sites on the target mRNA. The above equation implies that, for X_{s} much smaller than the half-maximal concentration, the RNAi-mediated degradation increases as in Model 1 (for h_{4} = 1) or as in Model 2 (for h_{4} = h_{2}), while it saturates at higher levels of X_{s}, differently from Model 3.
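The two properties described above (half of the maximal rate at the half-maximal siRNA amount, and saturation at high amounts) can be illustrated with a standard Hill function. This is a sketch under stated assumptions: the text calls Model 4 a Hill kinetic model, but its exact expression is not reproduced here, so the textbook Hill form below is an assumption, and the names `hill_rate`, `max_rate`, `half_conc` and `hill` are hypothetical. The numerical values are the ones reported for the EGFP mRNA fit.

```python
def hill_rate(x_s, max_rate, half_conc, hill):
    """Hill-type rate: max_rate * x_s^hill / (half_conc^hill + x_s^hill).

    Assumed textbook form; returns a per-mRNA degradation rate (min^-1).
    """
    xs_h = x_s ** hill
    return max_rate * xs_h / (half_conc ** hill + xs_h)

# Values reported for the fit on EGFP mRNA levels
MAX_RATE = 8.1e-3   # min^-1, maximal degradation rate due to RNAi
HALF_CONC = 0.105   # pmol, Michaelis-Menten-like constant
HILL = 4.47         # exponent (number of siRNA target sites)

rate_at_half = hill_rate(HALF_CONC, MAX_RATE, HALF_CONC, HILL)   # half of MAX_RATE
rate_saturated = hill_rate(200.0, MAX_RATE, HALF_CONC, HILL)     # close to MAX_RATE
rate_low = hill_rate(0.001, MAX_RATE, HALF_CONC, HILL)           # far below saturation
```

Multiplying the returned rate by the mRNA level gives the RNAi-induced degradation term under this assumed form.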

Parameter Identification

The four models were fitted to the three mRNA and protein experimental datasets (I, II and III), by searching for the parameter values for which the model-generated data best fitted the experimental data, according to a squared error measure. The results of the fitting procedure for each of the models, together with the optimized values of their parameters, are given in Table

Numerical fitting results of the four models for in vitro experimental data for the EGFP protein and mRNA.

**Experiment on EGFP mRNA levels**

| Model | Fit Err. | Pred. Err. | Parameters |
| --- | --- | --- | --- |
| Model 1 | 1.00 | 0.98 | c_{1} = 1.38 × 10^{-4} (pmol min)^{-1} |
| Model 2 | 0.13 | 0.12 | c_{2} = 5.00 × 10^{-3} (pmol^{h_{2}} min)^{-1}; h_{2} = 0.126 |
| Model 3 | 1 | 1 | c_{3}h_{3} = 1.40 × 10^{-4} (pmol min)^{-1}; (cleavage and dissociation rate)/k_{m} = 1.33 × 10^{3} a.u. |
| Model 4 | 0.04 | 0.05 | Michaelis-Menten-like constant = 0.105 pmol; maximal degradation rate = 8.1 × 10^{-3} min^{-1}; h_{4} = 4.47 |

**Experiment on EGFP protein levels**

| Model | Fit Err. | Pred. Err. | Parameters |
| --- | --- | --- | --- |
| Model 1 | 0.97 | 0.81 | c_{1} = 9.90 × 10^{-5} (pmol min)^{-1} |
| Model 2 | 0.46 | 0.39 | c_{2} = 1.10 × 10^{-3} (pmol^{h_{2}} min)^{-1}; h_{2} = 0.456 |
| Model 3 | 1 | 1 | c_{3}h_{3} = 1.20 × 10^{-4} (pmol min)^{-1}; (cleavage and dissociation rate)/k_{m} = … × 10^{3} a.u. |
| Model 4 | 0.21 | 0.12 | Michaelis-Menten-like constant = 12.9 pmol; maximal degradation rate = 8.6 × 10^{-3} min^{-1}; h_{4} = 4.49 |

The relative value of the fitting error (Fit Err.) and the relative value of the prediction error (Pred. Err.) are given for each model, together with the corresponding optimized values of its parameters. Units of measurement are reported for the dimensional parameters (a.u. stands for arbitrary units of concentration).

The fitting results for the mRNA levels (set I) are shown in the accompanying figure. The optimized value of the maximal degradation rate in Model 4 is 0.0081 min^{-1}, indicating that the strength of siRNA-mediated mRNA degradation is comparable to the strength of basal mRNA degradation, since its value is of the same order of magnitude as the degradation rate d_{m} of the EGFP mRNA.

**Numerical fitting of the four models on the in vitro experimental results on mRNA EGFP expression levels presented on Figure 2**. The optimized parameter values and the corresponding fit error of each model are given in Table 2.

Note also that the parameters found for Model 2 include a coefficient h_{2} = 0.126, hence less than unity. Since h_{2} describes the number of siRNA binding sites on the targeted mRNA, it should be greater than, or equal to, 1 in order to have a clear biological interpretation. However, if we constrain this parameter to be greater than or equal to one, then the model optimizes at the value h_{2} = 1, which makes Model 2 identical to Model 1.

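We have observed that in all numerical simulations, Models 1 and 3 are almost indistinguishable. The large optimized value of the ratio between the cleavage and dissociation rate of Model 3 and k_{m}, together with the low value of c_{3}h_{3} = 1.40 × 10^{-4} (pmol min)^{-1}, suggests that in this regime the degradation rate of Model 3 reduces to δ(X_{s}, X_{m}) ≈ c_{3}h_{3}X_{s}X_{m}, which has the same form as Model 1 with c_{1} ≈ c_{3}h_{3} (compare the values of c_{1} and c_{3}h_{3} in Table 2).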

**Numerical fitting of the four models of the in vitro experimental results on protein EGFP levels presented on Figure 3**. The optimized parameter values and the corresponding fit error of each model are given in Table 2.

When we fitted the models to the third experimental dataset (set III), which was obtained on a different cell line, with both a different target mRNA and a different siRNA oligomer, Model 4 still performed better than the others, with the smallest error, although Model 2 was a close match. Models 1 and 3 again behaved very similarly and had the largest error. It should be noted that, as with the previous fitting results, Model 2 also has a Hill coefficient smaller than unity in this case (h_{2} = 0.126).

**Finally we observed, as shown in **Additional file

Assessing the model predictive ability

The models described above have the same number of unknown parameters to be learned (Methods), except for Model 4, which has one extra parameter. To be sure that the improved performance of Model 4 in describing the experimental data was not due to overfitting, we computed, for each model and for each experimental dataset, the prediction error, which allows us to assess the generalisation performance of the models.

Conclusions

Our findings show that the simple Hill function described by Model 4 is sufficient to quantitatively describe the effect of RNA interference, at the mRNA and protein level, in mammalian cells in vitro, for varying concentration of siRNA oligomers.

One significant feature of Model 4 is that it can predict the saturation effect of the RNAi process that we observed experimentally. We considered the possibility that this saturation could in fact be due to the inability of the cell to take up high concentrations of siRNA oligomers; however, recent experiments

It has been demonstrated in _{s}
_{m}
_{M }
_{m}
_{m}
_{s }

The three parameters of Model 4 have a straightforward biological interpretation, and their values can easily be tuned to accommodate different efficiencies of RNAi. For example, the maximal degradation rate can be used to weigh the degradation due to RNAi against the endogenous mRNA degradation, and it sets the strength of RNAi, i.e. the maximal degradation rate that can be achieved. The Michaelis-Menten-like constant quantifies the concentration of siRNA oligomers needed to achieve half of the maximal degradation of the targeted mRNA. The Hill coefficient h_{4} can accommodate multiple target sites on the same mRNA, or the cooperativity of the RISC complex.

Clearly, the RNAi process is very complex and no one-to-one relationship can be found between the parameters of Model 4 and the biological components of RNAi. Nevertheless, it has been shown that between 10^{4} and 10^{5} siRNA oligomers per cell (corresponding to a concentration in the range 10 pM-100 pM) are sufficient to reach half-maximal degradation of the target mRNA. Model 4 predicts that half-maximal degradation is achieved for an amount of siRNA oligomers equal to its Michaelis-Menten-like constant. The value of this parameter when fitting mRNA levels is approximately 0.1 pmol, despite the different cell lines and mRNA-siRNA pairs tested (EGFP and tTA). This value corresponds to a concentration of 50 pM in our experimental setting, hence in good agreement with the previously reported range. Altogether, these observations suggest that this quantity could be cell-type, mRNA, and siRNA independent.

It is estimated that the concentration of active RISC in a cell is about 3 - 5 nM ^{-13}
^{-12}
^{3 }- 10^{4}. The above observations suggest that saturation begins when the number of siRNA oligomers in a cell becomes comparable to the number of RISC molecules.

We observed that the parameters of Model 4 estimated when fitting protein levels (set II experiments) are very close to the ones estimated when fitting mRNA levels (set I experiments). Namely, the optimized values of the maximal degradation rate and of the Hill coefficient h_{4} are very similar for both experimental datasets. This is important since these are two independent biological experiments, and it supports the mathematical robustness of Model 4. The only parameter changing between the two sets of experiments is the Michaelis-Menten-like constant, which represents the concentration of siRNA oligomers needed to achieve half of the maximal degradation rate. This is reflected in Figure

We also confirmed that Model 4 is cell-line-independent, mRNA-independent, and siRNA-independent, since it can accurately describe the RNA interference process on a different cell line (CHO) expressing a different mRNA (tTA), silenced by a different siRNA oligomer.

Interestingly, the difference in Model 4 parameters when testing a different mRNA-siRNA pair (i.e. tTA versus EGFP) shows that only the maximal degradation rate and the cooperativity coefficient h_{4} change significantly, suggesting that these two parameters can be used to describe changes in the specific strength of siRNA-mRNA silencing, whereas the Michaelis-Menten-like constant may be kept constant.

Recently it has been proposed that siRNA and microRNA efficacy, defined as the percentage decrease in the target mRNA level due to the silencing reaction, could be limited due to mRNA abundance

Model 4 predicts that the percentage decrease in target mRNA level (obtained from Eq. (13) simply dividing by _{m}
_{m}
_{m }
_{m }
_{m}
_{m}
_{m }
_{m}

The models discussed so far consider the average behavior of a population of cells. In the case of single-cell experiments, these models might not be adequate because, being deterministic, they cannot capture any stochastic effects.

Since RNA has a plethora of functional properties and plays many roles in regulating gene expression, it has been used in a number of different studies as a tool for elucidating gene function. In fact, with RNAi it is possible to selectively knock down any gene and even modulate its dosage

Methods

RNA interference by small interfering oligonucleotides (siRNA)

The sequence of the 21-mer siRNA double-stranded oligomers targeting EGFP was identical to the one reported in

Cell culture and transfection

HEK 293 cells stably expressing EGFP (kindly provided by Mara Alfieri) were maintained at 37°C in a 5% CO2-humidified incubator. HEK 293 cells were cultured in Dulbecco's modified Eagle's medium (DMEM, GIBCO BRL) supplemented with 10% heat-inactivated fetal bovine serum (FBS, Invitrogen) and 1% antibiotic/antimycotic solution (GIBCO BRL). CHO AA8 Tet-Off cells (Clontech) stably expressing the tetracycline-controlled transactivator (tTA) were maintained at 37°C in a 5% CO2-humidified incubator. CHO cells were cultured in alpha-MEM (GIBCO BRL) supplemented with 10% heat-inactivated fetal bovine serum (FBS, Invitrogen) and 1% antibiotic/antimycotic solution (GIBCO BRL). Cells were seeded at a density of 300,000 per well in 6-well plates and transfected 1 day after seeding using Lipofectamine 2000 (Invitrogen), according to the manufacturer's instructions, with siRNA (Silencer Custom siRNA, 100 …). The amounts of transfected siRNA oligomers were: 0, 0.001, 0.01, 0.05, 0.1, 0.5, 1.0, 10.0, 20.0, 40.0, 60.0, 80.0, 100.0 and 200.0 pmol in a total of 2 mL of medium (so the final concentrations of siRNA oligomers were 5 × 10^{-4}, 5 × 10^{-3}, 2.5 × 10^{-2}, 5 × 10^{-2}, 2.5 × 10^{-1}, 5 × 10^{-1}, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0, and 100 nM respectively). Each experiment was performed in biological triplicates, and the resulting standard deviations were computed and reported in each graph. One day post-transfection, the media and ligand were replaced. Transfected cells were collected 48 hours post-transfection for RNA extraction and subsequent analysis. FACS analysis was performed 60 hours after transfection.

RNA extraction and Real-time PCR

Total RNA extraction from 35 mm culture plates was performed using the Qiagen RNeasy Kit (Qiagen) according to the manufacturer's instructions. Retro-transcription of 1 ^{®}Reverse Transcription Kit (Qiagen), according to the manufacturer's instructions. Quantitative real-time PCR was performed using a LightCycler (Roche Molecular Biochemicals, Mannheim, Germany) to analyze the amplification status of EGFP and tTA. Amplification of the genes was performed from the cDNA obtained from the total RNA, using the LightCycler DNA Master SYBR Green I kit (Roche Molecular Biochemicals). Primer sequences for human GAPDH and Chinese hamster GAPDH (used as reference genes) were designed by Primer 3.0 ^{-DCt}. To confirm the specificity of the amplification signal, we considered the primer dissociation curve in each case.
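The residual exponent "-DCt" above suggests that relative expression was computed with the common 2^{-ΔCt} approach, in which the threshold cycle of the target is normalised to that of the reference gene (GAPDH). The sketch below illustrates that calculation under this assumption; the function names and the Ct values are hypothetical and are not measurements from this work.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^(-dCt), with dCt = Ct(target) - Ct(reference gene)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)

def expression_ratio(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Ratio of relative expression in siRNA-treated cells over negative-control cells."""
    treated = relative_expression(ct_target_treated, ct_ref_treated)
    control = relative_expression(ct_target_control, ct_ref_control)
    return treated / control

# Hypothetical Ct values, for illustration only.
ratio = expression_ratio(23.0, 20.0, 22.0, 20.0)
```

In this illustrative case the target crosses the threshold one cycle later in treated cells, which corresponds to half the control expression.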

FACS analysis

Cells from 35 mm culture plates were trypsinized, filtered and subjected to Fluorescence-Activated Cell Sorting (FACS) analysis 60 hours post-transfection in a Becton Dickinson FACSAria.

Models

In the context of the specific in vitro experiments we carried out, we can make the following assumptions to derive the mathematical model:

1. Cells express the target mRNA at a constant rate k_{m}

2. We assume that the siRNA oligomers will be quickly loaded into the RISC and that step 2 of Figure

Therefore, the steps 2 - 4 of RNA interference mechanism as shown in Figure _{m}, X_{s}

Steady-state equations

For the numerical fitting of the in vitro experiments we used the steady state equations for the mRNAs or proteins. For example, for the in vitro experiments on RNA levels, the experimental period of 48 hours before extracting the RNA is considered long enough for the mRNAs to approach their equilibrium value. In order to solve for the mRNA or protein steady state we assume that siRNA concentration remains constant through the 48 hours of the in vitro experiments. In general, the siRNA-RISC complex, is considered very stable and one can assume that the degradation of siRNA is so slow that it does not have any effect on the overall dynamics. The steady state equations for the mRNA concentrations of the four models are:

The corresponding mRNA equilibrium of the negative control experiments is simply _{m }= k_{m}/d_{m }
_{m}, X_{s}
_{m}/k_{m}
_{m}
_{m }
^{-1}, which corresponds to a half-life of 40 minutes, as estimated in _{m }
_{3}, _{3}, _{3}, _{m}
_{3 }
_{3 }and _{3}/_{m}

where

For the numerical fitting of the ratio of protein levels between negative and positive control, one needs to divide equation (14) by equation (15).
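The steady-state reasoning can be sketched numerically. The sketch assumes that the RNAi-induced degradation term can be written as g(X_s)·X_m, so that setting the mRNA balance k_m − d_m·X_m − g·X_m to zero gives X_m = k_m/(d_m + g); dividing by the negative-control equilibrium k_m/d_m yields the ratio d_m/(d_m + g), in which k_m cancels. The basal degradation rate is derived from the 40 minute half-life quoted in the text; the function names are hypothetical.

```python
import math

HALF_LIFE_MIN = 40.0
D_M = math.log(2) / HALF_LIFE_MIN  # basal mRNA degradation rate (min^-1)

def steady_state_ratio(g, d_m=D_M):
    """mRNA ratio (siRNA-treated / negative control) at steady state.

    `g` is the per-mRNA RNAi degradation rate (min^-1), assuming the
    RNAi term has the form g * X_m.
    """
    return d_m / (d_m + g)

def model1_g(x_s, c1):
    """Per-mRNA rate for a Model 1-like mass-action term c1 * X_s * X_m."""
    return c1 * x_s

# Example with the c_1 value reported for the EGFP mRNA fit, (pmol min)^-1.
C1 = 1.38e-4
ratios = [steady_state_ratio(model1_g(x, C1)) for x in (0.0, 10.0, 100.0, 200.0)]
```

The ratio equals 1 without siRNA and drops to one half when the RNAi-induced rate matches the basal degradation rate, which is one way to read the comparison between the two rates made in the text.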

Parameter fitting and Prediction Error

For the numerical fitting of the mRNA levels from in vitro experiments, we used the following error function:

where ^{i }

The absolute errors of each model were then normalized against the largest error. Namely, the error of Model 3 (which in both cases was the largest one) was set to 1, and the errors of the remaining three models were normalized against the error of Model 3.

The Prediction Error (PE) for each experimental dataset was computed by repeating the parameter fitting procedure described above, but this time using a leave-one-out cross-validation procedure. The PE was then computed as the average error (Eq. 16) between the predicted value and the experimental value across all the experimental points. As done for the error function, we then computed a relative value for the PE in order to compare the performance across the different models by normalising against the maximum PE across the four models, and reported it in Table
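The leave-one-out procedure and the normalisation against the largest error can be sketched as follows. Several things here are assumptions made for illustration: the error is taken to be a plain sum of squared differences, the fit is a simple grid search over candidate values (swapped in for whatever optimiser was actually used), the one-parameter model is a Model 1-like steady-state ratio, and the data are synthetic and noise-free, generated from the model itself rather than taken from the experiments.

```python
import math

D_M = math.log(2) / 40.0  # basal mRNA degradation rate (min^-1), 40 min half-life

def model1_ratio(x_s, c1):
    """Steady-state mRNA ratio (treated / control) for a rate term c1 * X_s * X_m."""
    return D_M / (D_M + c1 * x_s)

def squared_error(predict, param, xs, ys):
    """Sum of squared differences between model predictions and data."""
    return sum((predict(x, param) - y) ** 2 for x, y in zip(xs, ys))

def fit(predict, candidates, xs, ys):
    """Pick the candidate parameter with the smallest squared error (grid search)."""
    return min(candidates, key=lambda p: squared_error(predict, p, xs, ys))

def leave_one_out_error(predict, candidates, xs, ys):
    """Refit with each point held out; average the squared error on the held-out point."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        best = fit(predict, candidates, train_x, train_y)
        errors.append((predict(xs[i], best) - ys[i]) ** 2)
    return sum(errors) / len(errors)

def normalise(errors):
    """Divide every error by the largest one, so the worst model scores 1."""
    worst = max(errors.values())
    return {name: err / worst for name, err in errors.items()}

# Synthetic, noise-free data generated from the model itself (NOT experimental data).
xs = [0.0, 1.0, 10.0, 100.0, 200.0]
ys = [model1_ratio(x, 1e-4) for x in xs]
candidates = [1e-5, 5e-5, 1e-4, 2e-4]

best_c1 = fit(model1_ratio, candidates, xs, ys)
pe = leave_one_out_error(model1_ratio, candidates, xs, ys)
relative = normalise({"model_a": 2.0, "model_b": 0.5})  # hypothetical raw errors
```

Because each held-out point is predicted by a model that never saw it, a model with an extra parameter gains no automatic advantage, which is the reason for using the prediction error to check for overfitting.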

Please observe that in the case of the

Authors' contributions

GC and VS designed the experiments. GC, VS and MG performed the experiments. AP developed the models and AP and DdB conducted simulations. AP, GC and DdB drafted the manuscript. DdB and MdB conceived and supervised the collaboration and overall strategy of the project and edited the manuscript. All authors have read and approved the final manuscript.

Acknowledgements

Thanks to Laura Pisapia for the FACS analysis, S. Giovane and F. Menolascina for technical support and to Mara Alfieri for providing the Hek-EGFP stable clone. Funding for this work has been provided by the Italian Ministry of Research Grant ITALBIONET to DdB.