BMC Bioinformatics

official impact factor 3.03

Open Access Highly Access Research article

Cis-motifs upstream of the transcription and translation initiation sites are effectively revealed by their positional disequilibrium in eukaryote genomes using frequency distribution curves

Kenneth W Berendzen1,2, Kurt Stüber1, Klaus Harter2 and Dierk Wanke1,2*

Author Affiliations

1 Max-Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, D-50829 Köln, Germany

2 Universität Tübingen, ZMBP Pflanzenphysiologie, Auf der Morgenstelle 1, D-72076 Tübingen, Germany

For all author emails, please log on.

BMC Bioinformatics 2006, 7:522 doi:10.1186/1471-2105-7-522

Published: 30 November 2006

Abstract

Background

The discovery of cis-regulatory motifs still remains a challenging task even though the number of sequenced genomes is constantly growing. Computational analyses using pattern search algorithms have been valuable in phylogenetic footprinting approaches as have expression profile experiments to predict co-occurring motifs. Surprisingly little is known about the nature of cis-regulatory element (CRE) distribution in promoters.

Results

In this paper we used the Motif Mapper open-source collection of visual basic scripts for the analysis of motifs in any aligned set of DNA sequences. We focused on promoter motif distribution curves to identify positional over-representation of DNA motifs. Using differentially aligned datasets from the model species Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster and Saccharomyces cerevisiae, we convincingly demonstrated the importance of the position and orientation for motif discovery. Analysis with known CREs and all possible hexanucleotides showed that some functional elements gather close to the transcription and translation initiation sites and that elements other than the TATA-box motif are conserved between eukaryote promoters. While a high background frequency usually decreases the effectiveness of such an enumerative investigation, we improved our analysis by conducting motif distribution maps using large datasets.

Conclusion

This is the first study to reveal positional over-representation of CREs and promoter motifs in a cross-species approach. CREs and motifs shared between eukaryotic promoters support the observation that an eukaryotic promoter structure has been conserved throughout evolutionary time. Furthermore, with the information on positional enrichment of a motif or a known functional CRE, it is possible to get a more detailed insight into where an element appears to function. This in turn might accelerate the in depth examination of known and yet unknown cis-regulatory sequences in the laboratory.