Email updates

Keep up to date with the latest news and content from BMC Systems Biology and BioMed Central.

Open Access Methodology article

Mathematical modeling of translation initiation for the estimation of its efficiency to computationally design mRNA sequences with desired expression levels in prokaryotes

Dokyun Na12, Sunjae Lee1 and Doheon Lee1*

Author Affiliations

1 Department of Bio and Brain Engineering, KAIST, 335-Gwahangno, Yuseong-gu, Daejeon 305-701, Republic of Korea

2 Bioinformatics Research Center, KAIST, 335-Gwahangno, Yuseong-gu, Daejeon 305-701, Republic of Korea

For all author emails, please log on.

BMC Systems Biology 2010, 4:71  doi:10.1186/1752-0509-4-71

Published: 26 May 2010



Within the emerging field of synthetic biology, engineering paradigms have recently been used to design biological systems with novel functionalities. One of the essential challenges hampering the construction of such systems is the need to precisely optimize protein expression levels for robust operation. However, it is difficult to design mRNA sequences for expression at targeted protein levels, since even a few nucleotide modifications around the start codon may alter translational efficiency and dramatically (up to 250-fold) change protein expression. Previous studies have used ad hoc approaches (e.g., random mutagenesis) to obtain the desired translational efficiencies for mRNA sequences. Hence, the development of a mathematical methodology capable of estimating translational efficiency would greatly facilitate the future design of mRNA sequences aimed at yielding desired protein expression levels.


We herein propose a mathematical model that focuses on translation initiation, which is the rate-limiting step in translation. The model uses mRNA-folding dynamics and ribosome-binding dynamics to estimate translational efficiencies solely from mRNA sequence information. We confirmed the feasibility of our model using previously reported expression data on the MS2 coat protein. For further confirmation, we used our model to design 22 luxR mRNA sequences predicted to have diverse translation efficiencies ranging from 10-5 to 1. The expression levels of these sequences were measured in Escherichia coli and found to be highly correlated (R2 = 0.87) with their estimated translational efficiencies. Moreover, we used our computational method to successfully transform a low-expressing DsRed2 mRNA sequence into a high-expressing mRNA sequence by maximizing its translational efficiency through the modification of only eight nucleotides upstream of the start codon.


We herein describe a mathematical model that uses mRNA sequence information to estimate translational efficiency. This model could be used to design best-fit mRNA sequences having a desired protein expression level, thereby facilitating protein over-production in biotechnology or the protein expression-level optimization necessary for the construction of robust networks in synthetic biology.