  • Meeting abstract
  • Open access

Epi-letters - how to describe epigenetic signatures

Background

The concept of a chromatin-based epigenetic code, proposed more than a decade ago, associates specific combinations of chromatin marks with different gene expression states and their maintenance [1]. High-throughput technologies such as microarray profiling and next-generation sequencing make it possible to test this concept by profiling transcriptomes and multiple chromatin marks across many samples, conditions and organisms. The large amounts of data generated require efficient and instructive computational methods to identify and interpret biologically relevant correlations and to challenge the hypothesis of an epigenetic code.

Results

Here, I introduce a generally applicable bioinformatic method to group epigenetic information across genome-wide chromatin data sets. It automatically classifies the abundance of chromatin-based signals into discrete categories and encodes each category as a so-called epi-letter. Each genomic region can then be represented as a combined string of epi-letters, each letter referring to a different chromatin mark. This synoptic compilation can be clustered further to determine common epigenetic signatures, and it can be visualized using the sequence logo concept developed for DNA motifs [2].
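
The encoding step described above can be sketched as follows. This is only an illustrative sketch, not the author's implementation: the abstract does not state how signal abundance is classified, so the tertile cut points, the letters L/M/H, the function names `to_epi_letters` and `epi_strings`, and the toy signal values are all assumptions.

```python
from collections import Counter
from statistics import quantiles


def to_epi_letters(signals):
    """Map one mark's per-region signals to L/M/H using tertile cut points.

    Tertiles are an assumption made for this sketch; any other
    discretization rule could be substituted here.
    """
    low_cut, high_cut = quantiles(signals, n=3)  # two cut points
    return [
        "L" if s <= low_cut else "M" if s <= high_cut else "H"
        for s in signals
    ]


def epi_strings(marks):
    """Build one epi-letter string per region, one letter per mark.

    `marks` maps a mark name to its per-region signal values; letters
    appear in the dictionary's insertion order.
    """
    per_mark = [to_epi_letters(values) for values in marks.values()]
    return ["".join(letters) for letters in zip(*per_mark)]


# Made-up signal values for three marks over six regions.
toy = {
    "H3K4me3": [0.1, 0.2, 0.9, 1.5, 2.0, 3.0],
    "H3K27me3": [2.5, 2.0, 0.3, 0.2, 1.0, 1.1],
    "5mC": [0.0, 0.1, 0.0, 0.9, 0.8, 0.1],
}
strings = epi_strings(toy)

# Simplest possible grouping: count regions sharing an identical string.
signature_counts = Counter(strings)
```

Counting identical strings is the most basic way to group regions; a real analysis would more likely cluster on a distance between strings so that similar, not only identical, signatures fall into the same group.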

I present the results of applying the epi-letter principle to published data [3] for 12 chromatin marks (including DNA methylation) in the model organism Arabidopsis thaliana. Figure 1 shows an example of how a chromatin state is represented.

Figure 1

An example of a sequence logo representation for chromatin state CS3 (derived from [3]) in Arabidopsis, inferred from tiling arrays of 12 chromatin marks. Each stack represents the distribution and information content (in bits; the blue dashed line indicates the maximum bit score) of chromatin signatures with Low, Medium or High intensities within a cluster of regions with similar signatures. The bar below is adapted from [3]: colors indicate the distribution of chromatin marks from 25% (light purple) to 100% (dark purple); numbers inside cells indicate the percentage of tiles that are associated with each chromatin mark and assigned to CS3. The logos were generated using the WebLogo 3.2 program [4]. The epi-letters for 5mC are enlarged in the inset.
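
The stack heights in such a logo follow the information-content measure of sequence logos [2]. The sketch below shows that calculation for a three-letter Low/Medium/High alphabet, where the maximum is log2(3), about 1.585 bits. It is a simplified illustration: the small-sample correction of [2] is omitted, and the function names and example strings are hypothetical, not taken from the abstract or from WebLogo.

```python
from collections import Counter
from math import log2

ALPHABET = "LMH"  # assumed letters for Low, Medium and High intensity


def position_information(column):
    """Return (information in bits, per-letter heights) for one position.

    `column` holds the letters observed for one chromatin mark across
    all regions of a cluster. No small-sample correction is applied
    (a simplifying assumption of this sketch).
    """
    counts = Counter(column)
    total = len(column)
    freqs = {a: counts[a] / total for a in ALPHABET}
    entropy = -sum(f * log2(f) for f in freqs.values() if f > 0)
    info = log2(len(ALPHABET)) - entropy
    # Each letter's height is its frequency times the position's information.
    heights = {a: f * info for a, f in freqs.items()}
    return info, heights


def logo_profile(strings):
    """Compute information and letter heights for every string position."""
    return [position_information(col) for col in zip(*strings)]


# Made-up cluster of four regions described by three marks each.
profile = logo_profile(["HLM", "HLL", "HMH", "HLH"])
```

In this toy cluster the first mark is High in every region, so its stack reaches the maximum height, while a mark whose letters are evenly spread would have a stack of height close to zero.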

Conclusions

I propose a new and simple tool for finding and representing epigenetic patterns across genome-wide profiling data of different chromatin marks. I provide a proof-of-concept application with published data, resulting in a classification of epigenetic signatures in Arabidopsis thaliana. The method also has potential for de novo discovery and visualization of general genome-wide profiling patterns.

References

  1. Jenuwein T, Allis CD: Translating the histone code. Science 2001, 293(5532):1074–1080. 10.1126/science.1063127

  2. Schneider TD, Stephens RM: Sequence logos: a new way to display consensus sequences. Nucleic Acids Research 1990, 18(20):6097–6100. 10.1093/nar/18.20.6097

  3. Roudier F, Ahmed I, Berard C, Sarazin A, Mary-Huard T, Cortijo S, Bouyer D, Caillieux E, Duvernois-Berthet E, Al-Shikhley L, et al.: Integrative epigenomic mapping defines four main chromatin states in Arabidopsis. The EMBO Journal 2011, 30(10):1928–1938. 10.1038/emboj.2011.103

  4. Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Research 2004, 14(6):1188–1190. 10.1101/gr.849004

Acknowledgement

I thank my PhD advisors Arndt von Haeseler and Ortrun Mittelsten Scheid for interesting discussions and support. The travel fellowship from the Swiss Bioinformatics Institute for participating in SCS8/ISMB2012 is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Huy Q Dinh.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite this article

Dinh, H.Q. Epi-letters - how to describe epigenetic signatures. BMC Bioinformatics 13 (Suppl 18), A4 (2012). https://doi.org/10.1186/1471-2105-13-S18-A4

  • DOI: https://doi.org/10.1186/1471-2105-13-S18-A4
