Table 1

Statistical measures of batch effects and performance evaluation of normalization and batch correction

| Dataset | Statistical measure | Raw β | QNβ | Lumi | ABnorm | QNβ+EB | Lumi+EB | ABnorm+EB |
|---|---|---|---|---|---|---|---|---|
| 2 | Number (%) of CpGs associated with batch at p < 0.01 | 17,458 (66) | 6,466 (24.4) | 8,478 (32) | 6,926 (26) | 12 | 25 | 23 |
| | PCs associated with batch (% variance explained)* | 1 (51.6) | 1 (17.9) | 1 (22.1) | 1 (18.9) | None | None | None |
| | Number (%) of differentially methylated CpGs between case and control at p < 0.01 | 345 (1.3) | 759 (2.9) | 714 (2.7) | 763 (2.9) | 1,155 (4.2) | 1,146 (4.3) | 1,229 (4.6) |
| 3 | Number (%) of CpGs associated with batch at p < 0.01 | 13,881 (50.0) | 10,300 (37.3) | 12,668 (46) | 9,694 (35.2) | 2 | 6 | 8 |
| | PCs associated with batch (% variance explained)* | 1 (50.4) | 1 (24.8) | 1 (30.6) | 1 (23.8) | None | None | None |
| | Number (%) of differentially methylated CpGs between cancer and normal at p < 0.01 | 794 (2.9) | 1,877 (6.8) | 1,131 (4.1) | 1,635 (5.9) | 2,799 (10.1) | 2,400 (8.7) | 2,289 (8.3) |


Raw β: raw average β without any correction; QNβ: quantile normalization of average β values; Lumi: two-step quantile normalization of probe signals, as implemented in the R package "lumi"; ABnorm: quantile normalization of the A and B signals separately; EB: Empirical Bayes batch correction. * Principal components (PCs) among the top 10 that were significantly associated with batch (p < 0.01, Wilcoxon test), with the percentage of variance each PC explains.
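To make the QNβ entry concrete, the following is a minimal Python sketch of quantile normalization applied to average β values. It is an illustration only, not the authors' implementation: the function name `quantile_normalize`, the toy β values, and the tie-handling rule (ties broken by probe position) are all assumptions introduced here.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize samples so they share one common distribution.

    columns: list of samples, each a list of beta values for the same probes.
    Ties are broken by probe position, which is a simplification (assumption).
    """
    n_probes = len(columns[0])
    if any(len(col) != n_probes for col in columns):
        raise ValueError("all samples must cover the same probes")

    # Sort each sample, then average across samples at each rank to get
    # the reference distribution.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(col[r] for col in sorted_cols) for r in range(n_probes)]

    normalized = []
    for col in columns:
        # Probe indices ordered by value; Python's sort is stable.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            # Each probe receives the reference value at its own rank.
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical beta values for three probes in two samples.
samples = [
    [0.10, 0.80, 0.45],
    [0.90, 0.20, 0.30],
]
normalized = quantile_normalize(samples)
```

After normalization every sample contains the same set of values, and only their assignment to probes differs according to each sample's original ranking.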

Sun et al. BMC Medical Genomics 2011 4:84   doi:10.1186/1755-8794-4-84
