This article is part of the supplement: Proceedings of the Third Annual RECOMB Satellite Workshop on Massively Parallel Sequencing (RECOMB-seq 2013)

Open Access Proceedings

Discovering and mapping chromatin states using a tree hidden Markov model

Jacob Biesinger1,3, Yuanfeng Wang2 and Xiaohui Xie1,3*

  • * Corresponding author: Xiaohui Xie

  • † Equal contributors

Author Affiliations

1 Department of Computer Science, University of California, Irvine, CA, USA

2 Department of Physics and Astronomy, University of California, Irvine, CA, USA

3 Institute for Genomics and Bioinformatics, University of California, Irvine, CA, USA

BMC Bioinformatics 2013, 14(Suppl 5):S4  doi:10.1186/1471-2105-14-S5-S4

Published: 10 April 2013


New biological techniques and technological advances in high-throughput sequencing are paving the way for systematic, comprehensive annotation of many genomes, allowing differences between cell types or between disease and normal tissues to be determined with unprecedented breadth. Epigenetic modifications have been shown to exhibit rich diversity between cell types and to correlate tightly with cell-type-specific gene expression, and changes in epigenetic modifications have been implicated in several diseases. Previous attempts to understand chromatin state have focused on identifying combinations of epigenetic modifications but, where multiple cell types are involved, have not considered the lineage of the cells in question.

We present a Bayesian network that uses epigenetic modifications to simultaneously model 1) chromatin mark combinations that give rise to different chromatin states and 2) propensities for transitions between chromatin states through differentiation or disease progression. We apply our model to a recent dataset of histone modifications covering nine human cell types, with nine epigenetic modifications measured for each. Since exact inference in this model is intractable at the scale of these datasets, we develop several variational approximations and explore their accuracy. Our method exhibits several desirable features, including improved accuracy of inferring chromatin states, improved handling of missing data, and linear scaling with dataset size. The source code for our model is available at http:// webcite