Research article

Cross-species comparison significantly improves genome-wide prediction of cis-regulatory modules in Drosophila

Saurabh Sinha1*, Mark D Schroeder2, Ulrich Unnerstall2, Ulrike Gaul2 and Eric D Siggia1

Author Affiliations

1 Center for Studies in Physics and Biology, The Rockefeller University, 1230 York Ave, New York, NY 10021, USA

2 Laboratory of Developmental Neurogenetics, The Rockefeller University, 1230 York Ave, New York, NY 10021, USA


BMC Bioinformatics 2004, 5:129  doi:10.1186/1471-2105-5-129

Published: 9 September 2004



The discovery of cis-regulatory modules in metazoan genomes is crucial for understanding the connection between genes and organism diversity. It is important to quantify how comparative genomics can improve computational detection of such modules.


We run the Stubb software on the entire D. melanogaster genome to obtain predictions of modules involved in segmentation of the embryo. Stubb uses a probabilistic model to score sequences for clustering of transcription factor binding sites, and can exploit multi-species data within the same probabilistic framework. The predictions are evaluated, after careful manual annotation, using publicly available gene expression data for thousands of genes. We demonstrate that the use of a second genome (D. pseudoobscura) for cross-species comparison significantly improves the prediction accuracy of Stubb, and is a more sensitive approach than intersecting the results of separate runs over the two genomes. The entire list of predictions is made available online.
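
The idea of scoring sequence windows for clustering of binding sites can be illustrated with a minimal sketch. This is not Stubb's probabilistic model: it swaps in a much simpler technique, summing positive log-odds matches to a single toy weight matrix within fixed windows. The matrix `PWM`, the uniform background frequency, and the function names are all hypothetical and chosen only for illustration; sequences are assumed to be uppercase A, C, G, T.

```python
import math

# Hypothetical toy weight matrix for a 4-bp motif (assumption, not from the paper):
# one dictionary of base probabilities per motif position.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def site_log_odds(site):
    """Log-odds of one candidate site under the motif versus the background."""
    return sum(math.log(PWM[i][base] / BACKGROUND) for i, base in enumerate(site))


def window_score(seq):
    """Sum the positive site log-odds over every start position in seq.

    Windows holding several good matches accumulate a higher score,
    which is a crude stand-in for binding-site clustering.
    """
    k = len(PWM)
    total = 0.0
    for i in range(len(seq) - k + 1):
        total += max(0.0, site_log_odds(seq[i:i + k]))
    return total


def scan(seq, window, step):
    """Score fixed-length windows along seq; returns (start, score) pairs."""
    return [
        (start, window_score(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a cluster of three motif matches flanked by poly-T.
    toy = "T" * 12 + "AGCT" * 3 + "T" * 12
    for start, score in scan(toy, window=12, step=12):
        print(start, round(score, 2))
```

In this toy setting the window containing the three motif copies scores highest and the flanking windows score zero. A real run would involve many weight matrices, a trained background model, and, for the two-species case, aligned orthologous sequence, none of which this sketch attempts.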


Evolutionary conservation of modules serves as a filter to improve their detection in silico. The future availability of additional fruitfly genomes therefore carries the prospect of highly specific genome-wide predictions using Stubb.