Email updates

Keep up to date with the latest news and content from BMC Research Notes and BioMed Central.

Open Access Short Report

Two common profiles exist for genomic oligonucleotide frequencies

Shang-Hong Zhang* and Lei Wang

Author Affiliations

Key Laboratory of Gene Engineering of Ministry of Education, and Biotechnology Research Center, Sun Yat-sen University, Guangzhou, 510275, China

For all author emails, please log on.

BMC Research Notes 2012, 5:639  doi:10.1186/1756-0500-5-639

Published: 17 November 2012

Abstract

Background

It was reported that there is a majority profile for trinucleotide frequencies among genomes. And further study has revealed that two common profiles, rather than one majority profile, exist for genomic trinucleotide frequencies. However, the origins of the common/majority profile remain elusive. Moreover, it is not clear whether the features of common profile may be extended to oligonucleotides other than trinucleotides.

Findings

We analyzed 571 prokaryotic genomes (chromosomes) and some selected eukaryotic nuclear genomes as well as other genetic systems to study their compositional features. We found that there are also two common profiles for genomic oligonucleotide frequencies: one is from low-GC content genomes, and the other is from high-GC content genomes. Furthermore, each common profile is highly correlated to the average profile of random sequences with corresponding GC content and generated according to first-order symmetry.

Conclusions

The causes for the existence of two common profiles would mainly be GC content variations and strand symmetry of genomic sequences. Therefore, both GC content and strand symmetry would play important roles in genome evolution.

Keywords:
Genomic sequence; Oligonucleotide; Frequency profile; GC content; Strand symmetry