Research article (Open Access, Highly Accessed)

An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies

Michiel E Adriaens1,2*, Magali Jaillard1, Lars MT Eijssen1, Claus-Dieter Mayer3 and Chris TA Evelo1,2

Author Affiliations

1 Department of Bioinformatics - BiGCaT, Maastricht University, Maastricht, The Netherlands

2 Netherlands Consortium for Systems Biology (NCSB), University of Amsterdam, The Netherlands

3 Department of Biomathematics and Statistics Scotland, University of Aberdeen, Rowett Institute of Nutrition and Health, Aberdeen, UK

BMC Genomics 2012, 13:42  doi:10.1186/1471-2164-13-42

Published: 25 January 2012

Abstract

Background

The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and the study design all differ from those of transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. In particular, the main challenges in normalizing "regulation microarrays" are (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate.

Results

We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures.

Conclusion

T-quantile normalization applied separately to each channel and Tukey's biweight scaling outperform the other methods, both in conserving the separation between enriched and un-enriched signals and in identifying genomic regions known to be enriched. T-quantile normalization is preferable because it additionally improves comparability between microarrays. In contrast, popular normalization approaches such as quantile normalization, LOWESS, Peng's method and VSN alter the data distributions of regulation microarrays to such an extent that using them will substantially affect the reliability of the downstream analysis.
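
To make the idea of normalizing the channels separately more concrete, the following is a minimal sketch and not the authors' implementation. It assumes log2 intensities stored as probes-by-arrays matrices, one for the immunoprecipitated (IP) channel and one for the input channel, and applies a plain quantile normalization across arrays within each channel before forming log-ratios. The function names, the matrix layout and the simple handling of ties are all assumptions made for illustration.

```python
import numpy as np

def quantile_normalize(x):
    """Give every column (array) of x the same empirical distribution.

    Ties are broken arbitrarily by argsort; this is a simplification.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                 # per-column sort order
    ranks = np.argsort(order, axis=0)             # rank of each value in its column
    reference = np.sort(x, axis=0).mean(axis=1)   # mean of the k-th smallest values
    return reference[ranks]                       # map each rank to the reference value

def normalize_channels_separately(ip_log2, input_log2):
    """Quantile-normalize each channel on its own, then form log-ratios.

    Assumed layout: rows are probes, columns are arrays, values are log2.
    """
    ip_norm = quantile_normalize(ip_log2)
    input_norm = quantile_normalize(input_log2)
    log_ratio = ip_norm - input_norm              # per-probe enrichment estimate
    return ip_norm, input_norm, log_ratio

# Hypothetical toy data: 1000 probes on 3 arrays, with array-specific shifts.
rng = np.random.default_rng(0)
ip = rng.normal(10.0, 1.5, size=(1000, 3)) + np.array([0.0, 0.4, -0.3])
inp = rng.normal(9.0, 1.0, size=(1000, 3)) + np.array([0.2, -0.1, 0.0])
ip_n, inp_n, m = normalize_channels_separately(ip, inp)
```

Because each channel is normalized only against the same channel on the other arrays, the IP distribution is never forced onto the input distribution within an array, which is the property the conclusion singles out as preserving the separation between enriched and un-enriched signals.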