Research article (Open Access)

Transcriptional programs: Modelling higher order structure in transcriptional control

John E Reid1, Sascha Ott2 and Lorenz Wernisch1

MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Systems Biology Centre, Coventry House, University of Warwick, Coventry CV4 7AL, UK


BMC Bioinformatics 2009, 10:218 doi:10.1186/1471-2105-10-218

Published: 16 July 2009

Abstract

Background

Transcriptional regulation is an important part of regulatory control in eukaryotes. Even when binding motifs for transcription factors (TFs) are known, finding binding sites by scanning sequences is plagued by false positives. One way to improve the detection of binding sites from motifs is to take the cooperativity of transcription factor binding into account. We propose a non-parametric probabilistic model, similar to a document topic model, for detecting transcriptional programs: groups of cooperating transcription factors together with the genes they co-regulate. The resulting transcriptional programs generalise both transcriptional modules and TF-target gene incidence matrices and provide a higher-level summary of these structures. The method does not require training sets of genes to be specified in advance, for example via gene expression data. The analysis is based on known binding motifs.
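
The topic-model analogy can be made concrete with a small sketch. It is not the authors' implementation. It only illustrates the input such a model might consume, assuming each gene's regulatory region is treated as a "document" and each predicted TF binding site as a "word". The gene names, TF names, and helper functions (`gene_documents`, `count_matrix`) are hypothetical.

```python
from collections import Counter

# Hypothetical motif-scan hits as (gene, transcription factor) pairs.
# All names here are illustrative, not taken from the paper.
hits = [
    ("geneA", "E2F1"), ("geneA", "NFYA"), ("geneA", "E2F1"),
    ("geneB", "E2F1"), ("geneB", "NFYA"),
    ("geneC", "HNF4A"), ("geneC", "CEBPA"),
]

def gene_documents(hits):
    """Group hits by gene: each gene becomes a bag of TF 'words'."""
    docs = {}
    for gene, tf in hits:
        docs.setdefault(gene, Counter())[tf] += 1
    return docs

def count_matrix(docs):
    """Return (genes, tfs, rows); rows[i][j] counts hits of tfs[j] in genes[i]."""
    genes = sorted(docs)
    tfs = sorted({tf for bag in docs.values() for tf in bag})
    # Counter returns 0 for TFs with no hit in a given gene.
    rows = [[docs[g][tf] for tf in tfs] for g in genes]
    return genes, tfs, rows

docs = gene_documents(hits)
genes, tfs, rows = count_matrix(docs)
```

In this representation a transcriptional program would play the role of a topic: a distribution over TFs that is shared by the genes it regulates. Fitting such programs is the job of the probabilistic model and is not shown here.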

Results

We applied our method to putative regulatory regions of 18,445 Mus musculus genes. We discovered just 68 transcriptional programs that effectively summarised the action of 149 transcription factors on these genes. Several of these programs were significantly enriched for known biological processes and signalling pathways. One transcriptional program has a significant overlap with a reference set of cell cycle specific transcription factors.
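
One common way to judge whether the overlap between a program's TFs and a reference set is larger than chance is a hypergeometric tail probability. The abstract does not say which test was used, so the sketch below is an assumption, and the function name `hypergeom_tail` is hypothetical. Only the population size of 149 TFs comes from the text; the other numbers in the usage line are made up for illustration.

```python
from math import comb

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: total number of TFs considered
    successes:  size of the reference set
    draws:      number of TFs in the program
    observed:   number of program TFs that are also in the reference set
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total

# 149 TFs in total (from the text); the reference-set size, program size
# and overlap below are hypothetical values chosen for illustration.
p_value = hypergeom_tail(population=149, successes=20, draws=10, observed=6)
```

A small `p_value` would indicate that an overlap at least this large is unlikely if the program's TFs were drawn at random from all 149.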

Conclusion

Our method is able to pick out higher order structure from noisy sequence analyses. The transcriptional programs it identifies potentially represent common mechanisms of regulatory control across the genome. It simultaneously predicts which genes are co-regulated and which sets of transcription factors cooperate to achieve this co-regulation. The programs we discovered enable biologists to choose new genes and transcription factors to study in specific transcriptional regulatory systems.

