Epigenetic domains found in mouse embryonic stem cells via a hidden Markov model
1 Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA
2 Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
BMC Bioinformatics 2010, 11:557 doi:10.1186/1471-2105-11-557
Published: 12 November 2010
Epigenetics is an important layer of transcriptional control necessary for cell-type-specific gene regulation. Recent studies have shown significant epigenetic patterns associated with developmental stages and diseases. However, previous studies have mostly been limited to focal epigenetic patterns, and methods for analyzing large-scale epigenetic organization are still lacking.
We developed a hidden Markov model (HMM) approach for detecting the types and locations of epigenetic domains from multiple histone modifications. We used this method to analyze a published ChIP-seq dataset of five histone modification marks (H3K4me2, H3K4me3, H3K27me3, H3K9me3, and H3K36me3) in mouse embryonic stem (ES) cells. We identified three types of domains, corresponding to active, non-active, and null states. In total, our three-state HMM identified 258 domains in the mouse genome containing 9.6 genes on average. These domains were validated by a number of criteria. The largest domains correspond to olfactory receptor (OR) gene clusters. Each Hox gene cluster also forms a separate epigenetic domain. We found that each type of domain is associated with distinct biological functions and structural changes during early cell differentiation.
The HMM approach successfully detects domains of consistent epigenetic patterns from ChIP-seq data, providing new insights into the role of epigenetics in long-range gene regulation.
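To illustrate the general idea of calling domains with a three-state HMM, the following is a minimal sketch and not the implementation used in this study. It assumes each genomic unit has already been reduced to one discrete symbol (0 = active-like marks, 1 = repressive-like marks, 2 = no enrichment), which is a simplification of the five-mark data. The probabilities in START, TRANS, and EMIT and the names viterbi and call_domains are hypothetical and chosen only for illustration. The most likely state path is decoded with the Viterbi algorithm, and consecutive units sharing a state are merged into domains.

```python
import math

# State labels follow the three domain types named in the abstract.
STATES = ["active", "non-active", "null"]

def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely state path (as state indices) for obs."""
    n = len(start_p)
    # Log-probabilities avoid numerical underflow on long sequences.
    score = [math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans_p[r][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans_p[best][s])
                         + math.log(emit_p[s][o]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def call_domains(path):
    """Merge runs of identical states into (first, last, label) domains."""
    domains, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            domains.append((start, i - 1, STATES[path[start]]))
            start = i
    return domains

# Hypothetical parameters: high self-transition probabilities favor
# long runs of the same state, which is what produces domains.
START = [1 / 3, 1 / 3, 1 / 3]
TRANS = [[0.90, 0.05, 0.05],
         [0.05, 0.90, 0.05],
         [0.05, 0.05, 0.90]]
EMIT = [[0.8, 0.1, 0.1],   # active emits mostly symbol 0
        [0.1, 0.8, 0.1],   # non-active emits mostly symbol 1
        [0.1, 0.1, 0.8]]   # null emits mostly symbol 2

# Toy input with one isolated off-pattern symbol at index 3.
obs = [0, 0, 0, 1, 0, 0, 2, 2, 2, 2, 1, 1, 1, 1]
path = viterbi(obs, START, TRANS, EMIT)
domains = call_domains(path)
for first, last, label in domains:
    print(first, last, label)
```

With strong self-transitions, a single discordant unit inside an otherwise consistent run tends to be absorbed into the surrounding domain rather than splitting it. In practice the emission model would need to handle multiple continuous-valued marks per unit, and the parameters would be estimated from the data rather than fixed by hand.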