Research article (Open Access)

MotifMap: integrative genome-wide maps of regulatory motif sites for model species

Kenneth Daily (1,2), Vishal R Patel (1,2), Paul Rigor (1,2), Xiaohui Xie (1,2) and Pierre Baldi (1,2,3)*

Author Affiliations

1 Department of Computer Science, University of California Irvine, Irvine, CA 92697 USA

2 Institute for Genomics and Bioinformatics, University of California Irvine, Irvine, CA 92697 USA

3 Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA 92697 USA

BMC Bioinformatics 2011, 12:495  doi:10.1186/1471-2105-12-495

Published: 30 December 2011

Abstract

Background

A central challenge of biology is to map and understand gene regulation on a genome-wide scale. For any given genome, only a small fraction of the regulatory elements embedded in the DNA sequence have been characterized, and there is great interest in developing computational methods to systematically map all these elements and understand their relationships. Such computational efforts, however, are significantly hindered by the overwhelming size of non-coding regions and the statistical variability and complex spatial organizations of regulatory elements and interactions. Genome-wide catalogs of regulatory elements for all model species simply do not yet exist.

Results

The MotifMap system uses databases of transcription factor binding motifs, refined genome alignments, and a comparative genomic statistical approach to provide comprehensive maps of candidate regulatory elements encoded in the genomes of model species. The system is used to derive new genome-wide maps for yeast, fly, worm, mouse, and human. The human map contains 519,108 sites for 570 matrices with a False Discovery Rate of 0.1 or less. The new maps are assessed in several ways, for instance using high-throughput experimental ChIP-seq data and AUC statistics, providing strong evidence for their accuracy and coverage. The maps can be usefully integrated with many other kinds of omic data and are available at http://motifmap.igb.uci.edu/.
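
As an illustration of the AUC statistic mentioned above, the following minimal Python sketch computes the area under the ROC curve for a set of scored candidate sites. It is not the MotifMap implementation: the function name `auc`, the example scores, and the labelling rule (1 if a predicted site overlaps a ChIP-seq peak, 0 otherwise) are assumptions made here for illustration only. The AUC is computed through its standard rank interpretation, the probability that a randomly chosen positive site scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    scores: one numeric score per candidate site (hypothetical values).
    labels: 1 if the site is treated as a true positive, e.g. it overlaps
            a ChIP-seq peak (an assumed labelling rule), otherwise 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative site")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical site scores and ChIP-seq overlap labels.
    example_scores = [0.9, 0.8, 0.3, 0.1]
    example_labels = [1, 1, 0, 0]
    print(auc(example_scores, example_labels))
```

The pairwise loop is quadratic in the number of sites, which is adequate for a small illustration; a genome-wide evaluation would normally use a sort-based rank computation instead.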

Conclusions

MotifMap and its integration with other data provide a foundation for analyzing gene regulation on a genome-wide scale, and for automatically generating regulatory pathways and hypotheses. The power of this approach is demonstrated and discussed using the P53 apoptotic pathway and the Gli hedgehog pathways as examples.