Open Access Research article

New words in human mutagenesis

Alexander Y Panchin1,2*, Sergey I Mitrofanov1, Andrei V Alexeevski3,4, Sergey A Spirin3,4 and Yuri V Panchin2,3

Author Affiliations

1 Department of Bioengineering and Bioinformatics, Moscow State University, Vorbyevy Gory 1-73, Moscow, 119992, Russian Federation

2 Institute for Information Transmission Problems, Russian Academy of Sciences, Bolshoi Karetny pereulok 19-1, Moscow, 127994, Russian Federation

3 Department of Mathematical Methods in Biology, Belozersky Institute, Moscow State University, Vorbyevy Gory 1-40, Moscow, 119991, Russian Federation

4 Department of Mathematics, Scientific-Research Institute for System Studies, Russian Academy of Sciences, Nakhimovskii prospekt 36-1, Moscow, 117218, Russian Federation


BMC Bioinformatics 2011, 12:268  doi:10.1186/1471-2105-12-268

Published: 30 June 2011



Substitution rates vary with nucleotide context, and the strength of this bias differs between contexts. The best-known example is the excess of C to T (C > T) mutations in CpG (CG) dinucleotides. The molecular mechanisms underlying this bias are important factors in human genome evolution and cancer development. Discovering other nucleotide contexts that have profound effects on substitution rates can improve our understanding of how mutations are acquired and why mutation hotspots exist.


We compared rates of inherited mutations in 1-4 bp nucleotide contexts using reconstructed ancestral states of human single nucleotide polymorphisms (SNPs) from intergenic regions. Chimpanzee and orangutan genomic sequences were used as outgroups. We uncovered 3.5-fold and 3.3-fold excesses of T > C mutations in the second position of the ATTG and ATAG words, respectively, and a 3.4-fold excess of A > C mutations in the first position of the ACAA word.
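As a rough illustration of this kind of analysis, the Python sketch below infers the ancestral allele of a SNP only when both outgroups agree with one of the two human alleles, then tallies substitutions by their ancestral 4 bp word. The function names, the toy records, the agreement rule, and the definition of fold excess as a ratio of per-site rates are all assumptions made for illustration; the abstract does not specify the exact reconstruction or normalization procedure.

```python
from collections import Counter

def ancestral_allele(alleles, chimp, orangutan):
    """Return the human allele carried by both outgroups, else None.

    Assumed rule: the ancestral state is called only when chimpanzee and
    orangutan agree and their base matches one of the human alleles.
    """
    if chimp == orangutan and chimp in alleles:
        return chimp
    return None

def context_mutation(flank5, alleles, flank3, chimp, orangutan):
    """Describe a SNP as (ancestral word, 1-based position, derived base)."""
    anc = ancestral_allele(alleles, chimp, orangutan)
    if anc is None:
        return None  # ancestral state ambiguous, skip the SNP
    derived = next(a for a in alleles if a != anc)
    word = flank5 + anc + flank3
    return word, len(flank5) + 1, derived

def fold_excess(context_mut, context_sites, background_mut, background_sites):
    """Ratio of the per-site rate in a context to a background per-site rate.

    This normalization is an assumption, not the published definition.
    """
    return (context_mut / context_sites) / (background_mut / background_sites)

# Toy records (hypothetical):
# (5' flank, (allele 1, allele 2), 3' flank, chimp base, orangutan base)
snps = [
    ("A", ("T", "C"), "TG", "T", "T"),
    ("A", ("C", "T"), "TG", "T", "T"),
    ("A", ("T", "C"), "AG", "T", "T"),
    ("G", ("A", "G"), "CC", "A", "G"),  # outgroups disagree, so skipped
]

counts = Counter()
for record in snps:
    mutation = context_mutation(*record)
    if mutation is not None:
        counts[mutation] += 1

for (word, position, derived), n in sorted(counts.items()):
    print(f"{word} position {position} > {derived}: {n}")
```

Requiring agreement between the two outgroups is a conservative choice: it discards SNPs whose ancestral state is ambiguous rather than guessing, at the cost of a smaller sample.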


Although all the observed biases are less pronounced than the 5.1-fold excess of C > T mutations in CG dinucleotides, the three 4 bp mutation contexts mentioned above (and their complementary contexts) are well distinguished from all other mutation contexts. The challenge now is to identify the mechanisms responsible for these excesses of mutations.
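The complementary contexts mentioned above follow from reading the same mutation on the opposite DNA strand. The short Python sketch below shows one way to derive them; the helper names are hypothetical and are not taken from the paper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def complementary_context(word, position, derived):
    """Map (word, 1-based position, derived base) to the opposite strand."""
    rc_word = reverse_complement(word)
    rc_position = len(word) - position + 1
    rc_derived = reverse_complement(derived)
    return rc_word, rc_position, rc_derived

print(complementary_context("ATTG", 2, "C"))  # ('CAAT', 3, 'G')
print(complementary_context("ATAG", 2, "C"))  # ('CTAT', 3, 'G')
print(complementary_context("ACAA", 1, "C"))  # ('TTGT', 4, 'G')
```

Under this mapping, a T > C mutation in the second position of ATTG corresponds to an A > G mutation in the third position of CAAT, so the two can be pooled when strand is not informative.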