Computational identification of rare codons of Escherichia coli based on codon pairs preference
1 School of Biological Science and Technology, Shenyang Agricultural University, Shenyang 110161, PR China
2 State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, PR China
BMC Bioinformatics 2010, 11:61 doi:10.1186/1471-2105-11-61
Published: 28 January 2010
Codon bias is believed to play an important role in the control of gene expression. In Escherichia coli, some rare codons that can limit the expression level of exogenous proteins have been identified through genetic engineering experiments. Previous studies have confirmed the existence of codon pair preference in many genomes, but the underlying cause of this bias has not been well established. Here we focus on the patterns of rarely used synonymous codons and introduce a novel method that identifies rare codons in Escherichia coli from codon pair bias alone.
In Escherichia coli, we defined "rare codon pairs" by calculating the frequency of occurrence of all codon pairs in coding sequences. Rare codons, which are avoided in genes, contribute strongly to the formation of rare codon pairs. Our investigation also showed that many of these rare codon pairs contain termination codons and the recognition sites of restriction enzymes. Furthermore, we developed a new index (Frare). Comparison with the classical indices revealed a significant negative correlation between Frare and the indices that depend on reference datasets.
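As an illustration of the counting step described above, the following Python sketch tallies adjacent in-frame codon pairs across a set of coding sequences and lists the least frequent ones. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the handling of trailing partial codons, and the use of a simple "n least frequent pairs" cutoff are all hypothetical, and the abstract does not state the threshold that defines a rare codon pair.

```python
from collections import Counter


def split_codons(cds):
    """Split a coding sequence into in-frame codons.

    Assumption: any trailing partial codon is silently dropped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def count_codon_pairs(sequences):
    """Count every adjacent (codon, next codon) pair over all sequences."""
    counts = Counter()
    for cds in sequences:
        codons = split_codons(cds)
        counts.update(zip(codons, codons[1:]))
    return counts


def pair_frequencies(counts):
    """Convert raw pair counts into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def rarest_pairs(counts, n):
    """Return the n least frequent observed pairs (hypothetical cutoff).

    Pairs that never occur are absent from the counter and so are not
    reported here; ties are broken alphabetically for reproducibility.
    """
    return sorted(counts, key=lambda pair: (counts[pair], pair))[:n]
```

For example, `count_codon_pairs(["ATGAAAAAAAAATAA"])` reads the codons ATG, AAA, AAA, AAA, TAA and yields four adjacent pairs, two of which are (AAA, AAA). On a real genome the input would be the full set of annotated coding sequences.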
Our approach suggests that rare codons can be identified by studying the context in which a codon lies. In addition, the frequency of rare codons (Frare) could be a useful index of codon bias even when expression abundance information is unavailable.
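Because Frare is described here as the frequency of rare codons, it can be sketched as the fraction of codons in a gene that belong to a given rare-codon set. The snippet below is a hedged illustration only: the function name `frare`, the choice to count every in-frame codon in the denominator, and the example set `EXAMPLE_RARE_CODONS` are assumptions. The set is a placeholder of codons commonly cited as rare in E. coli, not the set identified in this study.

```python
# Placeholder set (assumption): codons often cited as rare in E. coli.
# It is NOT the rare-codon set derived in the paper.
EXAMPLE_RARE_CODONS = frozenset({"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"})


def frare(cds, rare_codons=EXAMPLE_RARE_CODONS):
    """Fraction of in-frame codons in `cds` that are in `rare_codons`.

    Assumptions: the sequence starts in frame, a trailing partial codon
    is dropped, and every remaining codon counts toward the denominator.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return 0.0
    rare = sum(1 for codon in codons if codon in rare_codons)
    return rare / len(codons)
```

With this definition, `frare("ATGAGGAAAAGA")` sees the codons ATG, AGG, AAA, AGA and returns 0.5, since two of the four codons are in the placeholder set. No reference set of highly expressed genes is needed, which matches the point that the index does not depend on expression abundance information.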