Abstract
Background
It is well known that DNA methylation, as an epigenetic factor, has an important effect on gene expression and disease development. Detecting differentially methylated loci under different conditions, such as cancer types or treatments, is of great interest in current research because it is important in cancer diagnosis and classification. However, inappropriate testing approaches can result in many false positives and/or false negatives. Appropriate and powerful statistical methods are desirable but very limited in the literature.
Results
In this paper, we propose a nonparametric method to detect differentially methylated loci under multiple conditions for Illumina Array Methylation data. We compare the new method with other methods using simulated and real data. Our study shows that the proposed method outperforms the other methods considered in this paper.
Conclusions
Because of the unique features of Illumina Array Methylation data, commonly used statistical tests can lose power or give misleading results. Therefore, appropriate statistical methods are crucial for this type of data. More powerful statistical approaches remain to be developed.
Availability
R code is available upon request.
Background
It is well known that DNA methylation has important effects on transcriptional regulation, chromosomal stability, genomic imprinting, and X-inactivation [1,2]. It has also been shown to be associated with many human diseases, such as various types of cancer [3-11].
With the advances of BeadArray technology, genome-wide high-throughput methylation data can be easily generated by Illumina GoldenGate and Infinium Methylation Assays. After preprocessing steps, such as background correction and normalization, are applied to the raw fluorescent intensities, a summarized β value is generated for each locus from about 30 replicates in the same array as follows: $\beta = \max(M, 0) / (\max(M, 0) + \max(U, 0) + 100)$, where M is the average signal from the methylated allele and U is that from the unmethylated allele. The β values are continuous numbers between 0 and 1, with 0 standing for totally unmethylated and 1 for completely methylated.
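As an illustration only, the β value summary can be sketched in Python. The function name is hypothetical, and the offset of 100 is an assumption (it is the constant commonly quoted for Illumina arrays); the sketch is not the authors' R code.

```python
def beta_value(m_signal, u_signal, offset=100.0):
    """Summarize methylated (M) and unmethylated (U) intensities as a beta value.

    `offset` is an assumed stabilizing constant; it keeps the ratio defined
    when both signals are close to zero.
    """
    # Negative intensities can appear after background correction, so clip at 0.
    m = max(m_signal, 0.0)
    u = max(u_signal, 0.0)
    return m / (m + u + offset)


print(beta_value(900.0, 0.0))   # strongly methylated locus
print(beta_value(0.0, 500.0))   # unmethylated locus
```

Because of the offset, the value stays strictly below 1 even when U is zero.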
It has been shown that the β value is rarely normally distributed [9,12,13]. Therefore the commonly used t-test for case-control designs and ANOVA for multiple conditions are not the most powerful approaches for detecting differentially methylated loci. Observing this, Wang proposed a model-based likelihood ratio test to detect differentially methylated loci for case and control data under the assumption that the β value follows a three-component normal-uniform distribution [9]. Based on simulation studies, Wang showed that in some situations the proposed test was better than the simple t-test.
However, Wang's method did not consider the effect of age, which has been shown to be highly associated with methylation [14,15]. Noting the importance of the age effect, one may use a linear regression with age included as a covariate when analyzing methylation data with multiple conditions, such as cancer types. However, the underlying assumption of equal variances may not be satisfied [12]. Therefore the commonly used linear regression method may not be appropriate.
In this paper, we consider methylation data with multiple conditions and propose a nonparametric method which incorporates the age effect by combining p-values from independent tests [12,16,17]. More specifically, we first divide subjects into several age groups; then for each age group, a nonparametric Kruskal-Wallis test is conducted for the given locus and the p-value is recorded. An overall p-value for that locus is then obtained by combining the p-values from all age groups. Using a real methylation data set with three conditions and a simulation study, we show that the proposed test is more powerful than other methods, including linear regression.
Method
Proposed method
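Assume there are K conditions and G age groups. For each age group g (g = 1, 2, ..., G), we apply the nonparametric Kruskal-Wallis (KW) test and obtain a p-value $p_g^{KW}$; the overall p-value can then be estimated by the Fisher test [18]: $P^{KW} = \Pr\left(\chi^2_{2G} > -2\sum_{g=1}^{G} \ln p_g^{KW}\right)$, where $\chi^2_{2G}$ denotes a chi-square random variable with 2G degrees of freedom.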
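The procedure for one locus can be sketched as follows. This is a minimal Python illustration, not the authors' R code: all function names are hypothetical, the Kruskal-Wallis p-value uses the usual chi-square approximation with K - 1 degrees of freedom, and the chi-square tail is computed with a closed-form recurrence so that only the standard library is needed.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        term = math.exp(-y)
        total = term
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return total
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) * math.exp(-y) / math.gamma(1.5)
    a = 1.5
    for _ in range((df - 1) // 2):
        total += term
        term *= y / a
        a += 1.0
    return total


def kruskal_wallis(groups):
    """Return (H, p) using midranks and the standard tie correction."""
    n_total = sum(len(g) for g in groups)
    counts = Counter(v for g in groups for v in g)
    # Midrank of a value = average of the 1-based positions it occupies.
    rank_of, position = {}, 0
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = position + (t + 1) / 2.0
        position += t
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3.0 * (n_total + 1)
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / (n_total ** 3 - n_total)
    if correction == 0.0:  # all observations identical: no evidence of a difference
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


def fisher_combine(p_values):
    """Fisher's method: -2 * sum(log p) compared with chi-square on 2G df."""
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf(stat, 2 * len(p_values))


def combined_kw(data_by_age_group):
    """data_by_age_group: one entry per age group, each a list of K samples."""
    return fisher_combine([kruskal_wallis(g)[1] for g in data_by_age_group])


# Toy beta values for G = 2 age groups and K = 3 conditions (made-up numbers).
toy = [
    [[0.10, 0.12, 0.15], [0.20, 0.22, 0.25], [0.30, 0.33, 0.35]],
    [[0.11, 0.21, 0.31], [0.12, 0.22, 0.32], [0.13, 0.23, 0.33]],
]
print(combined_kw(toy))
```

Note that Fisher's method requires strictly positive p-values, since it takes their logarithm.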
Combined ANOVA test
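Similarly, we can use ANOVA in place of the KW test for each age group and obtain an overall p-value by the same Fisher combination, using the p-value $p_g^{A}$ from the ANOVA test in age group g: $P^{A} = \Pr\left(\chi^2_{2G} > -2\sum_{g=1}^{G} \ln p_g^{A}\right)$.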
Combined median test
Another nonparametric test is the median test, which uses the following statistic for each age group: $M = 4\sum_{k=1}^{K} \frac{(A_k - n_k/2)^2}{n_k}$, where A_k is the number of observations from group k that exceed the median of the pooled data, and n_k is the sample size of group k. When the sample sizes are large, under the null hypothesis that all samples have the same median, the statistic M has a chi-square distribution with K - 1 degrees of freedom. The overall p-value from the combined median test can be calculated as $P^{M} = \Pr\left(\chi^2_{2G} > -2\sum_{g=1}^{G} \ln p_g^{M}\right)$, where $p_g^{M}$ is the p-value from the median test in age group g.
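A minimal Python sketch of the median test statistic for one age group is given below. It assumes the common large-sample form $M = 4\sum_k (A_k - n_k/2)^2/n_k$; the function name and the toy β values are hypothetical. For K = 3 conditions the chi-square reference distribution has 2 degrees of freedom, so the p-value reduces to exp(-M/2).

```python
import math
import statistics


def median_test_statistic(groups):
    """Median test statistic from counts of observations above the pooled median."""
    pooled_median = statistics.median([v for g in groups for v in g])
    stat = 0.0
    for g in groups:
        n_k = len(g)
        a_k = sum(1 for v in g if v > pooled_median)  # observations above the pooled median
        stat += (a_k - n_k / 2.0) ** 2 / n_k
    return 4.0 * stat


toy = [
    [0.10, 0.20, 0.30, 0.40],
    [0.50, 0.60, 0.70, 0.80],
    [0.85, 0.90, 0.95, 0.99],
]
m_stat = median_test_statistic(toy)
p_value = math.exp(-m_stat / 2.0)  # valid only for K = 3 (2 degrees of freedom)
print(m_stat, p_value)
```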
Combined Welch test
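We also consider the Welch test, which allows unequal variances across groups. For each age group, we have the test statistic [19]: $W = \frac{\sum_{k=1}^{K} w_k (\bar{x}_k - \tilde{x})^2 / (K-1)}{1 + \frac{2(K-2)}{K^2-1} \sum_{k=1}^{K} \frac{(1 - w_k/w)^2}{n_k - 1}}$, where $w_k = n_k / s_k^2$, $w = \sum_{k=1}^{K} w_k$, $\tilde{x} = \sum_{k=1}^{K} w_k \bar{x}_k / w$, and $\bar{x}_k$, $s_k^2$ and $n_k$ are the sample mean, sample variance and sample size of group k.
Under the null hypothesis, the statistic W is asymptotically distributed as an F distribution with K - 1 and $f = (K^2 - 1) \big/ \left(3 \sum_{k=1}^{K} \frac{(1 - w_k/w)^2}{n_k - 1}\right)$ degrees of freedom. The Welch test is an improvement of the Cochran test [20], which usually has an inflated type I error rate, especially for small sample sizes [19,21]. The overall p-value from the combined Welch test is: $P^{W} = \Pr\left(\chi^2_{2G} > -2\sum_{g=1}^{G} \ln p_g^{W}\right)$, where $p_g^{W}$ is the p-value from the Welch test in age group g.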
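The Welch statistic and its degrees of freedom can be sketched in Python as below, following the standard formulas of Welch (1951). The function name is hypothetical and the data are made up. The sketch stops at the statistic and degrees of freedom because the F-distribution tail is not available in the Python standard library; a statistics package (for example `scipy.stats.f.sf`) could supply the p-value.

```python
import statistics


def welch_statistic(groups):
    """Return (W, df1, df2) for Welch's one-way test with unequal variances."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [statistics.mean(g) for g in groups]
    # Each group is weighted by n_k / s_k^2 (inverse variance of its mean).
    weights = [n_k / statistics.variance(g) for n_k, g in zip(sizes, groups)]
    w_total = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / w_total
    between = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / w_total) ** 2 / (n_k - 1) for w, n_k in zip(weights, sizes))
    w_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3 * lam)
    return w_stat, k - 1, df2


print(welch_statistic([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]))
```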
Methods for combining pvalues
Besides the Fisher method mentioned above, we also consider the Z-test for combining p-values from independent tests. First we calculate the weighted Z statistic using the individual p-values $p_g$ from each age group: $Z = \sum_{g=1}^{G} \sqrt{n_g}\, \Phi^{-1}(1 - p_g) \Big/ \sqrt{\sum_{g=1}^{G} n_g}$, where n_g is the total sample size in age group g and Φ is the cumulative distribution function (CDF) of the standard normal distribution. It is easy to see that this statistic has a standard normal distribution under the null hypothesis. The overall p-value is calculated as 1 - Φ(Z). Note that here we use a one-sided test to obtain the overall p-value.
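A minimal Python sketch of the weighted Z-test follows. It assumes weights equal to the square root of the age-group sample size, which is one common choice and an assumption of this sketch; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def weighted_z_combine(p_values, sample_sizes):
    """Combine one-sided p-values with weights sqrt(n_g); return (Z, overall p)."""
    std_normal = NormalDist()
    # inv_cdf requires 0 < 1 - p < 1, so p-values must lie strictly inside (0, 1).
    numerator = sum(
        math.sqrt(n_g) * std_normal.inv_cdf(1.0 - p_g)
        for p_g, n_g in zip(p_values, sample_sizes)
    )
    z = numerator / math.sqrt(sum(sample_sizes))
    return z, 1.0 - std_normal.cdf(z)


# Made-up p-values and sample sizes for three age groups.
print(weighted_z_combine([0.04, 0.20, 0.01], [60, 90, 75]))
```

With a single age group the combined p-value equals the input p-value, which is a convenient sanity check.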
Simulation settings
To compare the methods as applied to an individual age group, we simulate β values for three treatment groups from the beta distribution with parameters a and b, beta(a,b), and from the normal distribution truncated to (0,1) with parameters μ and σ^{2}, TN(μ, σ^{2}). The sample sizes (denoted as s in Tables 1 and 2) for the three treatments are either balanced (s = 30 for each) or unbalanced (s = 20, 30, and 40). First we compare the estimated type I error rates with the nominal significance level of 0.05 under the null hypothesis of no differences among treatment groups. Then we compare the empirical power of each method under various situations. The empirical power is the proportion of replicates in which the null hypothesis is rejected.
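A simulation of this kind can be sketched in Python for the KW test. The distribution parameters below (beta(2, 5), truncated normal means 0.30, 0.35, 0.40 with σ = 0.1), the seed, and the 2000 replicates are illustrative assumptions, not the settings behind Tables 1 and 2; all function names are hypothetical.

```python
import math
import random


def kw_p_value_three_groups(groups):
    """Kruskal-Wallis p-value for K = 3 continuous samples (no ties assumed)."""
    labeled = sorted((v, k) for k, g in enumerate(groups) for v in g)
    rank_sums = [0.0] * len(groups)
    for rank, (_, k) in enumerate(labeled, start=1):
        rank_sums[k] += rank
    n_total = len(labeled)
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r ** 2 / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n_total + 1)
    return math.exp(-h / 2.0)  # chi-square upper tail with K - 1 = 2 df


def draw_truncated_normal(rng, mu, sigma):
    """Rejection sampler for a normal distribution truncated to (0, 1)."""
    while True:
        x = rng.gauss(mu, sigma)
        if 0.0 < x < 1.0:
            return x


def rejection_rate(draw_for_group, sizes, replicates, alpha=0.05, seed=1):
    """Proportion of simulated data sets in which the KW test rejects at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replicates):
        groups = [[draw_for_group(rng, k) for _ in range(n)] for k, n in enumerate(sizes)]
        if kw_p_value_three_groups(groups) < alpha:
            rejections += 1
    return rejections / replicates


# Null hypothesis: the same beta(2, 5) distribution in all three groups.
type_one = rejection_rate(lambda rng, k: rng.betavariate(2.0, 5.0), [30, 30, 30], 2000)

# Alternative: truncated normals whose means differ across groups.
means = [0.30, 0.35, 0.40]
power = rejection_rate(
    lambda rng, k: draw_truncated_normal(rng, means[k], 0.1), [20, 30, 40], 2000
)
print(type_one, power)
```

Under the null setting the rejection rate estimates the type I error rate; under the alternative it estimates the empirical power.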
Table 1. Estimated type I error rates at significance level 0.05 with 10000 replicates.
Table 2. Empirical power at significance level 0.05 with 10000 replicates.
A real data set
We use a real methylation data set from the United Kingdom Ovarian Cancer Population Study (UKOPS) [15], with 274 controls, 131 pre-treatment cases, and 135 post-treatment cases, to compare the performance of the proposed test with the others. These methylation data were generated by the Illumina Infinium Human Methylation27 BeadChip and can be downloaded under accession number GSE19711 from the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo).
For this data set, there are 27578 loci. In the data quality control process, we removed subjects with BS values less than 4000 or coverage rates less than 95%. We also separated subjects into 6 age groups (50-55, 55-60, 60-65, 65-70, 70-75, and 75 and over). Table 3 gives the number of subjects in each age group by treatment group. For each locus, we apply the above-mentioned approaches.
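The grouping step can be sketched in Python as follows. The treatment of boundary ages (for example, whether a 55-year-old belongs to 50-55 or 55-60) is not stated, so half-open intervals with the lower bound included are assumed here; the function name, record layout and values are hypothetical.

```python
from collections import defaultdict


def age_group(age):
    """Map an age in years to an age-group label, or None if below 50."""
    lower_bounds = [50, 55, 60, 65, 70, 75]
    labels = ["50-55", "55-60", "60-65", "65-70", "70-75", "75 and over"]
    if age < lower_bounds[0]:
        return None
    index = sum(1 for bound in lower_bounds if age >= bound) - 1
    return labels[index]


# Made-up records for one locus: (age, treatment group, beta value).
records = [
    (52, "control", 0.12),
    (57, "pre-treatment", 0.20),
    (58, "control", 0.15),
    (76, "post-treatment", 0.31),
]

# beta values organised as age group -> treatment group -> list of values.
by_group = defaultdict(lambda: defaultdict(list))
for age, treatment, beta in records:
    label = age_group(age)
    if label is not None:
        by_group[label][treatment].append(beta)

print({label: dict(treatments) for label, treatments in by_group.items()})
```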
Table 3. Number of samples in each age group by treatment group used in the paper after removing subjects with BS < 4000, coverage rate < 95%, or age > 80.
Results
Simulation results
Table 1 reports the estimated type I error rates from each method under different conditions. In most cases, the estimated type I error rates are close to the nominal significance level, as expected. Table 2 gives the empirical power of each method. It can be seen that Mood's median test usually has the lowest power in the simulations. None of the ANOVA, Welch and KW tests is uniformly most powerful; in other words, their performance depends on the distribution from which the data are generated. In our simulation study, the KW test is usually as powerful as or more powerful than the ANOVA test. The true distribution of the β value may vary from locus to locus, and it is impossible to simulate all possible distributions. However, based on the real data, we know that the distributions of the β value are far from the normal distribution, under which ANOVA is the best test. Therefore, we prefer nonparametric tests, which are more robust.
Results from real data set
For the real data set, we applied the above-mentioned methods to obtain the overall p-value (using either the Fisher or the Z test to combine p-values from individual age groups) for each locus. We then used various cutoff p-values, 0.001, 0.0001, 0.00001, and 0.000001, and counted how many loci have smaller p-values under each method. Table 4 reports the results. We can see that the KW method usually finds more significant loci than the other methods. The two methods for combining p-values, Fisher and Z test, perform similarly, although the Z test usually gives slightly more significant loci, except for the median test. Figure 1 plots the negative log10 p-values from pairs of methods. It shows that the KW method gives smaller p-values, especially when the differences among the three treatment groups are not large (e.g., negative log10 p-values between 3 and 6). From Figure 1 we can see that, for a given cutoff p-value, most of the loci identified by the ANOVA test or the median test were also detected by the Welch test; in turn, most of the loci identified by the Welch test were also detected by the KW test. This indicates that the KW test is more powerful than the other methods compared.
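The counting step, and the transformation used for the plots, can be sketched in Python as below. The function names and the example p-values are hypothetical; "smaller" is read as strictly less than the cutoff.

```python
import math


def count_significant(p_values, cutoffs=(1e-3, 1e-4, 1e-5, 1e-6)):
    """Number of loci whose overall p-value is below each cutoff."""
    return {c: sum(1 for p in p_values if p < c) for c in cutoffs}


def neg_log10(p_values):
    """Negative log10 p-values, the scale used when comparing pairs of methods."""
    return [-math.log10(p) for p in p_values]


# Made-up overall p-values for four loci from one method.
p_overall = [0.5, 5e-4, 5e-5, 5e-7]
print(count_significant(p_overall))
print(neg_log10(p_overall))
```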
Table 4. Number of significant differentially methylated loci detected for given cutoff p-values based on the real data.
Figure 1. Negative log10(p-value) from pairs of methods. (a) Combined ANOVA test vs. combined median test, both using the Fisher method to combine p-values. (b) Combined ANOVA test vs. combined Welch test, both using the Fisher method to combine p-values. (c) Combined ANOVA test vs. combined Kruskal-Wallis test, both using the Fisher method to combine p-values. (d) Combined KW test using the Z test vs. combined KW test using the Fisher method to combine p-values.
Discussion and conclusions
Because of the unique features of the β value of methylation data, traditional statistical methods, such as linear regression and the ANOVA test, may not be appropriate. It has been shown that methylation is highly correlated with age; ignoring the age effect may cause many false positives and/or false negatives. The effect of age may also not be linear; therefore we need a better way to account for it. In this paper, we use a p-value combination method to deal with the age effect. For each age group, we use a nonparametric method to compare the treatment groups. It is important to find powerful and robust nonparametric methods for this sort of data. Although we found that the KW method is more powerful than some other methods for methylation data, it is desirable to find more powerful tests in this area. Furthermore, we want to point out that many other methods can be used to combine p-values [22,23]; it may be possible to find a more powerful method to combine p-values for Illumina Array Methylation data. However, based on our experience, the Fisher test is more robust and can be used in situations where a small portion of the p-values are very small, while the Z test is more powerful when the effect sizes are similar (e.g., the p-values do not differ much) across all of the age groups. Finally, although in this paper we use different cutoff p-values to compare the performance of the tests, one may want to control the false positive rate. Several multiple comparison methods have been proposed for large-scale data sets to deal with situations where the variables (loci) are not independent [24-28]. However, which approach is more appropriate for methylation data remains to be studied.
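The contrast between the two combination methods can be illustrated numerically. The Python sketch below uses made-up p-values for six age groups and, as a simplifying assumption, an equal-weight Z-test (all age groups of the same size); the function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_p(p_values):
    """Fisher's method; the chi-square tail on 2G df has a closed form (even df)."""
    half = -sum(math.log(p) for p in p_values)  # half of the Fisher statistic
    term, total = math.exp(-half), 0.0
    for i in range(len(p_values)):
        total += term
        term *= half / (i + 1)
    return total


def z_p(p_values):
    """Equal-weight Z-test for combining one-sided p-values."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in p_values) / math.sqrt(len(p_values))
    return 1.0 - nd.cdf(z)


one_small = [1e-6, 0.5, 0.5, 0.5, 0.5, 0.5]  # signal in a single age group
all_similar = [0.05] * 6                      # moderate signal in every age group

print(fisher_p(one_small), z_p(one_small))
print(fisher_p(all_similar), z_p(all_similar))
```

In the first pattern the Fisher method gives the smaller overall p-value, and in the second the Z-test does, which matches the behaviour described above for these two made-up cases.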
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
ZC devised the basic idea of the new method and drafted the manuscript; HH and JL participated in study design and data analysis; HKTN, SN, XH and YD assisted the study and co-wrote the manuscript. All authors read and approved the final manuscript.
Acknowledgements
This article has been published as part of BMC Medical Genomics Volume 6 Supplement 1, 2013: Proceedings of the 2011 International Conference on Bioinformatics and Computational Biology (BIOCOMP'11). The full contents of the supplement are available online at http://www.biomedcentral.com/bmcmedgenomics/supplements/6/S1. Publication of this supplement has been supported by the International Society of Intelligent Biological Medicine.
References

1. Kuan PF, Wang S, Zhou X, Chu H: A statistical framework for Illumina DNA methylation arrays. Bioinformatics 2010, 26(22):2849.
2. Rakyan VK, Down TA, Thorne NP, Flicek P, Kulesha E, Gräf S, Tomazou EM, Bäckdahl L, Johnson N, Herberth M: An integrated resource for genome-wide identification and analysis of human tissue-specific differentially methylated regions (tDMRs). Genome Research 2008, 18(9):1518-1529.
3. Baylin SB, Ohm JE: Epigenetic gene silencing in cancer - a mechanism for early oncogenic pathway addiction? Nature Reviews Cancer 2006, 6(2):107-116.
4. Feinberg AP, Tycko B: The history of cancer epigenetics. Nature Reviews Cancer 2004, 4(2):143-153.
5. Jabbari K, Bernardi G: Cytosine methylation and CpG, TpG (CpA) and TpA frequencies. Gene 2004, 333:143-149.
6. Jones PA, Baylin SB: The fundamental role of epigenetic events in cancer. Nature Reviews Genetics 2002, 3(6):415-428.
7. Kulis M, Esteller M: DNA methylation and cancer. Adv Genet 2010, 70:27-56.
8. Laird PW: Principles and challenges of genome-wide DNA methylation analysis. Nature Reviews Genetics 2010, 11(3):191-203.
9. Wang S: Method to detect differentially methylated loci with case-control designs using Illumina arrays. Genetic Epidemiology 2011, 35(December):686-694.
10. Xu GL, Bestor TH, Bourc'his D, Hsieh CL, Tommerup N, Bugge M, Hulten M, Qu X, Russo JJ, Viegas-Péquignot E: Chromosome instability and immunodeficiency syndrome caused by mutations in a DNA methyltransferase gene. Nature 1999, 402(6758):187-191.
11. Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, Wen B, Wu H, Liu Y, Diep D: Increased methylation variation in epigenetic domains across cancer types. Nature Genetics 2011, 43(8):768-775.
12. Chen Z, Liu Q, Nadarajah S: A new statistical approach to detecting differentially methylated loci for case control Illumina array methylation data. Bioinformatics 2012, 28(8):1109-1113.
13. Huang H, Chen Z: Age adjusted nonparametric detection of differential DNA methylation with case-control designs.
14. Christensen BC, Houseman EA, Marsit CJ, Zheng S, Wrensch MR, Wiemels JL, Nelson HH, Karagas MR, Padbury JF, Bueno R: Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context. PLoS Genetics 2009, 5(8):e1000602.
15. Teschendorff AE, Menon U, Gentry-Maharaj A, Ramus SJ, Weisenberger DJ, Shen H, Campan M, Noushmehr H, Bell CG, Maxwell AP: Age-dependent DNA methylation of genes that are suppressed in stem cells is a hallmark of cancer. Genome Research 2010, 20(4):440-446.
16. Chen Z, Ng HKT: A robust method for testing association in genome-wide association studies. Human Heredity 2012, 73(1):26-34.
17. Chen Z: A new association test based on Chi-square partition for case-control GWA studies.
19. Welch B: On the comparison of several mean values: An alternative approach. Biometrika 1951, 38(3/4):330-336.
20. Cochran WG: Problems arising in the analysis of a series of similar experiments. Journal of Royal Statistical Society, Series C: Applied Statistics 1937, 4:102-118.
21. Chen Z, Ng HKT, Nadarajah S: A note on Cochran test for homogeneity in one-way ANOVA and meta-analysis.
22. Chen Z: Is the weighted z-test the best method for combining probabilities from independent tests? Journal of Evolutionary Biology 2011, 24(4):926-930.
23. Chen Z, Nadarajah S: Comments on 'Choosing an optimal method to combine p-values' by Sungho Won, Nathan Morris, Qing Lu and Robert C. Elston, Statistics in Medicine 2009; 28: 1537-1553. Statistics in Medicine 2011, 30(24):2959-2961.
24. Dudbridge F, Gusnanto A: Estimation of significance thresholds for genome-wide association scans. Genet Epidemiol 2008, 32(3):227-234.
25. Gao X, Starmer J, Martin ER: A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet Epidemiol 2008, 32(4):361-369.
26. Moskvina V, Schmidt KM: On multiple-testing correction in genome-wide association studies. Genet Epidemiol 2008, 32(6):567-573.
27. Pe'er I, Yelensky R, Altshuler D, Daly MJ: Estimation of the multiple testing burden for genome-wide association studies of nearly all common variants. Genet Epidemiol 2008, 32(4):381-385.
28. Chen Z, Liu Q: A new approach to account for the correlations among single nucleotide polymorphisms in genome-wide association studies. Human Heredity 2011, 72(1):1-9.