Relative comparison of different batch effects in Affymetrix GeneChip data. LEFT, Comparison of the numbers of differentially expressed genes in common between MCF7 and MCF10A triplicate samples, depending on which variables were included in the model used for the ComBat correction ('amp' = amplification, 'lab' = labelling, and 'scn' = scanner). Red points are mean counts, error bars are standard deviations, and the significance of increased counts compared to no ComBat correction is indicated ('**' for p < 0.05 and '***' for p < 0.01, based on two-tailed t-tests). RIGHT, Pairwise Pearson correlation heatmaps of MCF7 and MCF10A samples compensating for 1, 2, or 3 sources of batch effect using ComBat. Green and blue colour-bars denote MCF7 and MCF10A samples, respectively. The lightest colours denote un-amplified samples, slightly darker colours denote amplified samples, and the darkest colours denote the scanner/labelling comparison. A, all data treated as a single group without batch correction; B, batch correction for amplification; C, batch correction for amplification and an alternative labelling method; D, batch correction for amplification, labelling, and the different scanners used.
Kitchen et al. BMC Genomics 2011 12:589 doi:10.1186/1471-2164-12-589