Research article

Coevolution between simple sequence repeats (SSRs) and virus genome size

Xiangyan Zhao¹,², Yonglei Tian¹, Ronghua Yang², Haiping Feng², Qingjian Ouyang², You Tian², Zhongyang Tan¹,²*, Mingfu Li¹,²*, Yile Niu³, Jianhui Jiang², Guoli Shen² and Ruqin Yu²

Author affiliations

1 Chinese Academy of Inspection and Quarantine, Beijing, 100029, China

2 College of Biology, State Key Laboratory for Chemo/Biosensing and Chemometrics, Hunan University, Changsha, 410082, China

3 College of Environmental Science and Engineering, Hunan University, Changsha, 410082, China


Citation and License

BMC Genomics 2012, 13:435  doi:10.1186/1471-2164-13-435

Published: 30 August 2012



The relationship between the level of repetitiveness in genomic sequences and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but comparable studies of virus genomes are rare.


In this study, a total of 257 viruses were examined, covering 90% of genera. The results showed that simple sequence repeats (SSRs) are strongly, positively and significantly correlated with genome size. Each repeat class is distributed over a particular range of genome sequence lengths. Mono-, di- and trinucleotide repeats are widely distributed in all virus genomes; tetranucleotide SSRs are a common component of genomes larger than 100 kb; and among genomes smaller than 100 kb, no more than 50% contain penta- and hexanucleotide SSRs. Principal components analysis (PCA) indicated that dinucleotide repeats contribute most strongly to the differences in SSRs among virus genomes. SSRs tend to accumulate in larger virus genomes, and the longer the genome sequence, the longer the repeat units.
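As a rough illustration of the kind of analysis summarized above, the following minimal Python sketch counts perfect SSRs by repeat-unit length (mono- to hexanucleotide) in a sequence and computes a Pearson correlation between two lists of values, such as SSR counts and genome sizes. The minimum repeat thresholds, the function names `count_ssrs` and `pearson_r`, and the choice of Pearson's coefficient are assumptions made for illustration; the abstract does not state the thresholds or statistics actually used in the study.

```python
import math
import re

# Hypothetical minimum repeat counts per unit length (1 = mono ... 6 = hexa).
# These thresholds are assumptions, not values taken from the study.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}
CLASS_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def is_primitive(motif):
    """Return True if the motif is not a repeat of a shorter unit ('ATAT' is not)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def count_ssrs(seq, min_repeats=MIN_REPEATS):
    """Count perfect SSR tracts in seq, grouped by repeat-unit length."""
    seq = seq.upper()
    counts = {name: 0 for name in CLASS_NAMES.values()}
    for unit_len, min_rep in min_repeats.items():
        # A unit of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            # Skip motifs such as 'AA' so a mono- tract is not also counted as di-.
            if is_primitive(match.group(1)):
                counts[CLASS_NAMES[unit_len]] += 1
    return counts


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy sequence for demonstration only; not data from the study.
    print(count_ssrs("AAAAAAACATATATGACGACGACG"))
```

In practice, `count_ssrs` would be applied to each complete genome sequence, and the total counts passed to `pearson_r` together with the corresponding genome lengths. Imperfect or compound repeats are not handled by this sketch.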


We conducted this research from the perspective of viruses as a whole. We conclude that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible, to a certain degree, for the variance in SSR content.

Keywords: Simple sequence repeats; Microsatellite; Genome size; Virus genomes; Evolution