Dataset 1 before and after normalization and batch effect correction. A: PCA plot of all 93 samples using all 27,578 CpGs. Colors indicate batches. Nine pairs of technical replicates are labeled R1 to R9. The samples on Chip12 (circled with a dashed line) tend to separate from the other samples. B: Density plot of samples from Chip12 and Chip26, showing minor distribution biases between the two chips. C: PCA plot of the 24 samples from Chip22 and Chip26 using all 27,578 CpGs. The two samples marked with a bar across them are technical replicates. D: Box plot of pairwise CpG errors between the 9 pairs of technical replicates for unnormalized average β (red), QNβ (green), lumi (blue), and ABnorm (cyan). The unnormalized data have wider interquartile ranges and medians shifted away from the zero line. All normalized data have narrower interquartile ranges, with medians close to the zero line. E: Error means (lower panel) and average absolute deviations (upper panel) of the 9 pairs of technical replicates before (red) and after the three normalizations. The unnormalized data have the largest average absolute deviation for each replicate pair and a shifted mean for most pairs. All normalized data show reduced average absolute deviations. F: Error means (lower panel) and average absolute deviations (upper panel) of the 9 pairs of technical replicates before and after the three normalizations plus EB correction. Data that were normalized and then EB-corrected have almost identical error means and average absolute deviations to data that were normalized only.
Sun et al. BMC Medical Genomics 2011 4:84 doi:10.1186/1755-8794-4-84
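
The replicate statistics in panels D to F can be illustrated with a minimal sketch. It assumes that a "pairwise CpG error" is the per-CpG difference in β between the two members of a replicate pair, and that the "average absolute deviation" is the mean absolute deviation of those errors around their own mean. The caption does not state either definition explicitly, so both are assumptions, and the function name and the example values are hypothetical.

```python
def replicate_error_summary(beta_a, beta_b):
    """Summarize per-CpG errors between two technical replicates.

    beta_a, beta_b: sequences of beta values for the same CpGs, in the
    same order. Returns (mean_error, average_absolute_deviation).

    Assumption: the deviation is measured around the pair's mean error.
    """
    if len(beta_a) != len(beta_b):
        raise ValueError("replicates must cover the same CpGs")
    if len(beta_a) == 0:
        raise ValueError("at least one CpG is required")

    # Per-CpG error: difference in beta between the two replicates.
    errors = [a - b for a, b in zip(beta_a, beta_b)]
    n = len(errors)

    # A mean far from zero indicates a systematic shift between replicates.
    mean_error = sum(errors) / n

    # Spread of the errors around their mean.
    avg_abs_dev = sum(abs(e - mean_error) for e in errors) / n
    return mean_error, avg_abs_dev


# Hypothetical beta values for four CpGs measured in one replicate pair.
rep1 = [0.10, 0.50, 0.90, 0.30]
rep2 = [0.12, 0.46, 0.88, 0.30]
mean_error, avg_abs_dev = replicate_error_summary(rep1, rep2)
```

Under these assumptions, a normalization that works well would pull `mean_error` toward zero and shrink `avg_abs_dev` for each replicate pair. If the deviation is instead taken around zero, the last step would use `abs(e)` rather than `abs(e - mean_error)`.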