This article is part of the supplement: Neural Information Processing Systems (NIPS) workshop on New Problems and Methods in Computational Biology
Prediction of tissue-specific cis-regulatory modules using Bayesian networks and regression trees
1 McGill Centre for Bioinformatics. 3775 University Street, room 332, Montreal, Quebec, Canada, H3A 2B4
2 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98105, USA
BMC Bioinformatics 2007, 8(Suppl 10):S2 doi:10.1186/1471-2105-8-S10-S2
Published: 21 December 2007
In vertebrates, a large part of gene transcriptional regulation is carried out by cis-regulatory modules. These modules are believed to regulate much of the tissue specificity of gene expression.
We develop a Bayesian network approach for identifying cis-regulatory modules likely to regulate tissue-specific expression. The network integrates predicted transcription factor binding site information, transcription factor expression data, and target gene expression data. At its core is a regression tree modeling the effect of combinations of transcription factors bound to a module. A new unsupervised EM-like algorithm is developed to learn the parameters of the network, including the regression tree structure.
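To make the role of the regression tree concrete, the following is a minimal sketch, not the authors' implementation. It fits a small tree by greedy variance reduction on fully observed toy data, whereas the approach described above learns the tree inside a Bayesian network with an unsupervised EM-like procedure. The factor names (`TF_A`, `TF_B`, `TF_C`), the toy data, and the helper functions `fit_tree` and `predict` are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(rows, targets, features, max_depth=2):
    """Greedily fit a regression tree on binary 'factor bound' features.

    rows     : list of dicts mapping factor name -> 0 (absent) or 1 (bound)
    targets  : list of expression values, one per module
    features : factor names that may be used for splitting
    """
    if max_depth == 0 or len(set(targets)) <= 1:
        return {"leaf": mean(targets)}

    best_cost, best_feature = None, None
    for f in features:
        absent = [t for r, t in zip(rows, targets) if r[f] == 0]
        bound = [t for r, t in zip(rows, targets) if r[f] == 1]
        if not absent or not bound:
            continue  # this factor does not separate the modules
        cost = sse(absent) + sse(bound)
        if best_cost is None or cost < best_cost:
            best_cost, best_feature = cost, f

    # Stop if no split reduces the squared error.
    if best_feature is None or best_cost >= sse(targets):
        return {"leaf": mean(targets)}

    def subset(value):
        pairs = [(r, t) for r, t in zip(rows, targets) if r[best_feature] == value]
        return [r for r, _ in pairs], [t for _, t in pairs]

    rows0, targets0 = subset(0)
    rows1, targets1 = subset(1)
    return {
        "feature": best_feature,
        "absent": fit_tree(rows0, targets0, features, max_depth - 1),
        "bound": fit_tree(rows1, targets1, features, max_depth - 1),
    }


def predict(tree, row):
    """Walk the tree for one module and return the predicted expression."""
    while "leaf" not in tree:
        tree = tree["bound"] if row[tree["feature"]] == 1 else tree["absent"]
    return tree["leaf"]


# Toy data (hypothetical): the target gene is expressed only when TF_A and
# TF_B are both bound to the module; TF_C is irrelevant.
FEATURES = ["TF_A", "TF_B", "TF_C"]
ROWS = [
    {"TF_A": a, "TF_B": b, "TF_C": c}
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
]
TARGETS = [1.0 if r["TF_A"] and r["TF_B"] else 0.0 for r in ROWS]

TREE = fit_tree(ROWS, TARGETS, FEATURES)
```

In this toy setting the fitted tree splits on `TF_A` and then on `TF_B`, so the leaf reached by a module depends on the combination of bound factors rather than on any single one, and `TF_C` is never used.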
Our approach is shown to accurately identify known human liver-specific and erythroid-specific modules. When applied to the prediction of tissue-specific modules in 10 different tissues, the network predicts a number of important transcription factor combinations whose concerted binding is associated with tissue-specific expression.