Abstract
Background
High-throughput profiling of DNA methylation status of CpG islands is crucial to understanding the epigenetic regulation of genes. The microarray-based Infinium methylation assay by Illumina is one platform for low-cost high-throughput methylation profiling. Both Beta-value and M-value statistics have been used as metrics to measure methylation levels. However, there are no detailed studies of their relations and their strengths and limitations.
Results
We demonstrate that the relationship between the Beta-value and M-value methods is a logit transformation, and show that the Beta-value method has severe heteroscedasticity for highly methylated or unmethylated CpG sites. In order to evaluate the performance of the Beta-value and M-value methods for identifying differentially methylated CpG sites, we designed a methylation titration experiment. The evaluation results show that the M-value method provides much better performance in terms of Detection Rate (DR) and True Positive Rate (TPR) for both highly methylated and unmethylated CpG sites. Imposing a minimum threshold of difference can improve the performance of the M-value method but not the Beta-value method. We also provide guidance for how to select the threshold of methylation differences.
Conclusions
The Beta-value has a more intuitive biological interpretation, but the M-value is more statistically valid for the differential analysis of methylation levels. Therefore, we recommend using the M-value method for conducting differential methylation analysis and including the Beta-value statistics when reporting the results to investigators.
Background
Methylation of cytosine bases in DNA CpG islands is an important epigenetic regulation mechanism in organ development, aging and different disease statuses [1]. Hypermethylation of CpG islands located in the promoter regions of tumor suppressor genes has been firmly established as one of the most common mechanisms for gene regulation in cancer [2,3]. Therefore, high-throughput profiling of DNA methylation status of CpG islands is crucial for advancing our understanding of the influence of epigenomics [4-6]. The microarray-based Illumina Infinium methylation assay has recently been used in epigenomic studies [7-9] due to its high throughput, good accuracy, small sample requirement and relatively low cost [1].
To estimate the methylation status, the Illumina Infinium assay utilizes a pair of probes (a methylated probe and an unmethylated probe) to measure the intensities of the methylated and unmethylated alleles at the interrogated CpG site [10]. The methylation level is then estimated based on the measured intensities of this pair of probes. To date, two methods have been proposed to measure the methylation level. The first one is called the Beta-value, ranging from 0 to 1, which has been widely used to measure the percentage of methylation. This is the method currently recommended by Illumina [11,12]. The second method is the log2 ratio of the intensities of the methylated probe versus the unmethylated probe [13]. We refer to it as the M-value method because it has been widely used in mRNA expression microarray analysis. Since both methods have their own strengths and limitations, understanding the performance characteristics of both measures is very important in providing the best methylation analysis. Some studies have optimized the clustering of methylation data using the Beta-value method [14], but a rigorous comparison of the two methods has not been done. For this reason, we designed a titration experiment to compare and evaluate these two methods. In the following sections, we first define these two methods and derive the relationship between them. We then evaluate the performance of these two methods in detecting differentially methylated CpG sites.
Results
Definition of Beta-value and M-value
The Beta-value is the ratio of the methylated probe intensity and the overall intensity (sum of methylated and unmethylated probe intensities). Following the notation used by the Illumina methylation assay [12], the Beta-value for the i^{th} interrogated CpG site is defined as:
$$\mathrm{Beta}_i = \frac{\max(y_{i,methy}, 0)}{\max(y_{i,unmethy}, 0) + \max(y_{i,methy}, 0) + \alpha} \qquad (1)$$
where y_{i,methy} and y_{i,unmethy} are the intensities measured by the i^{th} methylated and unmethylated probes, respectively. To avoid negative values after background adjustment, any negative values are reset to 0. Illumina recommends adding a constant offset α (by default, α = 100) to the denominator to regularize the Beta-value when both methylated and unmethylated probe intensities are low. The Beta-value statistic results in a number between 0 and 1, or 0 and 100%. Under ideal conditions, a value of zero indicates that all copies of the CpG site in the sample were completely unmethylated (no methylated molecules were measured) and a value of one indicates that every copy of the site was methylated. If we assume the probe intensities are Gamma distributed, then the Beta-value follows a Beta distribution. For this reason, it has been named the Beta-value.
The M-value is calculated as the log2 ratio of the intensities of the methylated probe versus the unmethylated probe, as shown in Equation 2:
$$M_i = \log_2\left(\frac{\max(y_{i,methy}, 0) + \alpha}{\max(y_{i,unmethy}, 0) + \alpha}\right) \qquad (2)$$
Here we slightly modified the definition given in [13] by adding an offset α (by default, α = 1) to the intensity values. The offset prevents unexpectedly large changes due to small intensity estimation errors: for very small intensity values (especially between 0 and 1), small changes of the methylated and unmethylated probe intensities can result in large changes in the M-value. An M-value close to 0 indicates a similar intensity between the methylated and unmethylated probes, which means the CpG site is about half-methylated, assuming that the intensity data has been properly normalized by Illumina GenomeStudio or some other external normalization algorithm. Positive M-values mean that more molecules are methylated than unmethylated, while negative M-values mean the opposite. The M-value has been widely used in expression microarray analysis, especially two-color microarray analysis. Therefore, many existing microarray statistical frameworks using an M-value method can also be applied to methylation data analysis.
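The two definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation in the lumi package: the function names and the example intensities are hypothetical, and the sketch assumes that negative intensities are reset to 0 for both statistics and that the default offsets named in the text (100 for the Beta-value, 1 for the M-value) are used.

```python
import math


def beta_value(methy, unmethy, alpha=100.0):
    """Beta-value: methylated intensity over total intensity plus an offset."""
    m = max(methy, 0.0)    # negative background-adjusted values are reset to 0
    u = max(unmethy, 0.0)
    return m / (m + u + alpha)


def m_value(methy, unmethy, alpha=1.0):
    """M-value: log2 ratio of offset methylated to offset unmethylated intensity."""
    m = max(methy, 0.0)    # assumption: same clipping as for the Beta-value
    u = max(unmethy, 0.0)
    return math.log2((m + alpha) / (u + alpha))


# Hypothetical probe intensities, for illustration only.
print(round(beta_value(4000.0, 1000.0), 4))
print(round(m_value(4000.0, 1000.0), 4))
```

Both functions stay finite when both probe intensities are zero, which is the practical purpose of the offsets.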
Relationship between Beta-value and M-value
For Illumina methylation data, typically more than 95% of interrogated CpG sites have intensities (y_{i,unmethy} + y_{i,methy}) larger than 1000 (in our evaluation dataset, 99.8% of interrogated CpG sites had intensities higher than 1000). Therefore, the relatively small offset value (i.e., 100) in the denominator of Equation 1 has a negligible effect on the Beta-value for most interrogated CpG sites. Similarly, the offset α in Equation 2 is also negligible for most interrogated CpG sites. Based on this observation, the relationship between Beta-value and M-value can be derived by substitution using Equations 1 and 2 (with the offsets ignored):
$$\mathrm{Beta}_i = \frac{2^{M_i}}{2^{M_i} + 1}; \qquad M_i = \log_2\left(\frac{\mathrm{Beta}_i}{1 - \mathrm{Beta}_i}\right) \qquad (3)$$
Equation 3 indicates that the Beta-value is a logistic function of the M-value and, equivalently, that the M-value is a logit transformation of the Beta-value (with a base 2 logarithm instead of the natural logarithm). Figure 1 shows the relationship curve between Beta-values and M-values. For example, Beta-values of 0.2, 0.5 and 0.8 correspond to M-values of -2, 0 and 2, respectively. An approximately linear relationship can be observed between Beta-value and M-value in the middle range (from 0.2 to 0.8 for Beta-values and from -2 to 2 for M-values). As shown in Figure 1, Beta-values are severely compressed at the extremes when compared with M-values. As shown in the following sections, the transformation of Beta-value into M-value provides a straightforward method for using the Beta-value statistic while obtaining the statistical properties of the M-value.
Figure 1. The relationship curve between M-value and Beta-value.
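The conversion between the two statistics can be checked numerically. The sketch below assumes the offsets are ignored, as in the derivation above, so that the M-value is the base 2 logit of the Beta-value; the function names are hypothetical.

```python
import math


def beta_to_m(beta):
    """Base 2 logit: M = log2(Beta / (1 - Beta)), valid for 0 < Beta < 1."""
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m):
    """Inverse (logistic) mapping: Beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# Beta-values of 0.2, 0.5 and 0.8 map to M-values of about -2, 0 and 2.
for b in (0.2, 0.5, 0.8):
    print(b, round(beta_to_m(b), 6))
```

Because the mapping is one-to-one on the open interval (0, 1), converting to M-values and back recovers the original Beta-values up to floating-point error.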
Histograms of Beta-value and M-value
Figure 2 shows histograms of Beta-values and M-values for a typical sample measured by the Illumina Infinium HumanMethylation27 BeadChip, which interrogates 27,578 CpG sites in total, spread across promoter regions of 14,495 genes. The range of Beta-values is between 0 and 1, which can be interpreted as the approximation of the percentage of methylation for the population of a given CpG site in the sample. For M-values, it is difficult to directly infer the degree of methylation based on a single M-value, especially considering that the range of M-values may change across different datasets. The histogram of M-values clearly shows a bimodal distribution, with one positive mode (methylated mode) and one negative mode (unmethylated mode). Conversely, because Beta-values are severely compressed in the low (between 0 and 0.2) and high (between 0.8 and 1) ranges compared with the M-value statistic, their bimodal distribution is less obvious. Therefore, the Beta-value has a direct correspondence with an intuitive mental model of methylation (% methylation for a given site), whereas the M-value may provide some insight into the distribution of methylation across the genome that is difficult to visualize with the Beta-value. See the Conclusions section for additional discussion of this point.
Figure 2. The histograms of Beta-value (left) and M-value (right) (27,578 interrogated CpG sites in total).
The distribution of standard deviation across different methylation levels
Many statistical methods used in high-throughput data analysis, such as canonical linear models or ANOVA, assume the data is homoscedastic, i.e., that the variance is approximately constant. The violation of this assumption, which is described as heteroscedasticity in statistics, imposes serious challenges when applying these analyses to high-throughput data [15]. A common way to check the homoscedasticity of the data is by visualizing the relationship between mean and standard deviation [15,16]. Figure 3 shows the mean and standard deviation relations of the Beta-value and M-value, which were calculated based on technical replicates. The red dots represent the median standard deviation within a local window. The data was first ranked by mean methylation level, and then binned into twenty non-overlapping windows, with each bin containing 5% of the data. The standard deviation of the Beta-value is greatly compressed in the low (between 0 and 0.2) and high (between 0.8 and 1) ranges. This means the Beta-value has significant heteroscedasticity in the low and high methylation ranges. The problem of heteroscedasticity is effectively resolved after transforming Beta-value to M-value using Equation 3. The M-value is approximately homoscedastic: its standard deviation is approximately constant across the entire methylation range. The M-value statistic is therefore much more appropriate for the homoscedastic assumptions of most statistical models used for microarray analysis. It should be noted that other variance stabilization transformation methods may also be used to transform the Beta-value and stabilize the variance.
Figure 3. The mean and standard deviation relations of technical replicates. Beta-value (left) and M-value (right).
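The windowing described above (rank sites by mean, split into twenty equal-sized bins, take the median standard deviation per bin) can be sketched as follows. This is an illustrative reconstruction, not the code used to draw Figure 3; the function name and the input layout (one list of replicate measurements per CpG site) are assumptions.

```python
import statistics


def binned_median_sd(replicates, n_bins=20):
    """Return one (bin mean, median SD) pair per non-empty bin.

    replicates: list with one entry per CpG site, each entry being a list of
    at least two replicate measurements (Beta-values or M-values).
    """
    # Per-site (mean, sample SD), ranked by mean methylation level.
    stats = sorted((statistics.mean(r), statistics.stdev(r)) for r in replicates)
    n = len(stats)
    result = []
    for b in range(n_bins):
        window = stats[b * n // n_bins:(b + 1) * n // n_bins]
        if not window:
            continue  # fewer sites than bins
        centre = statistics.mean(m for m, _ in window)
        median_sd = statistics.median(s for _, s in window)
        result.append((centre, median_sd))
    return result
```

Plotting the second element of each pair against the first gives the trend line: a roughly flat line suggests homoscedastic data, while a curve that dips toward the ends of the range is the pattern the text describes for Beta-values.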
Performance comparison between Beta-values and M-values
Evaluation dataset
Titration data has been widely used to evaluate the performance of new methods for analyzing mRNA expression microarrays [16,17]. To apply this practice to methylation analysis, we designed a methylation titration experiment that enables the evaluation of the performance of the Beta-value and M-value methylation analysis methods. Similar to the titration design using GoldenGate methylation chips by Bibikova et al. [12], we selected two samples known to contain significant methylation differences. Sample A is a B-lymphocyte sample from a male donor. Sample B is a colon cancer sample from a female donor. The sources of the methylation differences between Samples A and B include: (1) gender differences; (2) pathological differences; (3) tissue differences. Samples A and B were mixed at five different titration ratios: 100:0, 90:10, 75:25, 50:50 and 0:100. The mixed samples were measured by the Illumina Infinium HumanMethylation27 BeadChip with technical replicates. Please see the Methods section for a more detailed description.
As shown in Figure 1, the middle range of the logit transformation is approximately linear, while the low and high ranges have clear nonlinear relationships between the Beta-value and M-value statistics. We have grouped the results of the transformations into three analysis groups, labeled as low, middle and high, with the middle analysis group corresponding to the approximately linear range and the low and high groups in the nonlinear range. This simplifies the analysis of the performance of each statistic.
Beta-value: low (0, 0.2), middle [0.2, 0.8] and high (0.8, 1).
M-value: low (-Inf, -2), middle [-2, 2] and high (2, Inf).
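The grouping can be expressed as a small helper. This is a sketch under the assumption that the M-value cut points are -2 and 2, i.e. the logit images of the Beta-value cut points 0.2 and 0.8; the function name and the `scale` argument are hypothetical.

```python
def methylation_group(value, scale="beta"):
    """Assign a methylation level to the low, middle or high analysis group.

    scale: "beta" uses cut points 0.2 and 0.8; anything else is treated as
    the M-value scale with cut points -2 and 2 (assumed, see lead-in).
    """
    low_cut, high_cut = (0.2, 0.8) if scale == "beta" else (-2.0, 2.0)
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "middle"  # closed interval: the cut points belong to the middle


print(methylation_group(0.1), methylation_group(0.5), methylation_group(0.95))
print(methylation_group(-3.1, scale="m"), methylation_group(2.4, scale="m"))
```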
Define differentially methylated CpG sites based on correlation
If an examined CpG site has a significant methylation difference between Samples A and B, its methylation profile should be correlated with the titration profile shown in Table 1. Therefore, we can use the correlation between the methylation and titration profiles to validate whether the CpG site is differentially methylated between Samples A and B. Following similar criteria used in the expression titration microarray experiments [16,17], we claim a CpG site is differentially methylated between Samples A and B if the absolute correlation coefficient between its titration and methylation profiles is larger than 0.8 (correlation p-value of about 0.05) for both Beta-value and M-value. There are 9845 investigated CpG sites satisfying this criterion. We treat them as True Positives (TP) to evaluate the performance of differential methylation analysis.
Table 1. Design of the methylation titration experiment
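The correlation criterion can be illustrated with a small pure-Python sketch. The profiles below are invented for illustration and use one value per titration point; the actual experiment includes technical replicates, whose counts are given in Table 1. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no titration signal
    return sxy / math.sqrt(sxx * syy)


def is_true_positive(titration, beta_profile, m_profile, cutoff=0.8):
    """True if |r| exceeds the cutoff on both the Beta and the M scale."""
    return (abs(pearson(titration, beta_profile)) > cutoff
            and abs(pearson(titration, m_profile)) > cutoff)


# Percentage of Sample A at the five titration points.
titration = [100, 90, 75, 50, 0]
# Hypothetical Beta-values for one site, and their base 2 logits.
beta = [0.90, 0.82, 0.70, 0.50, 0.10]
m = [math.log2(b / (1 - b)) for b in beta]
print(is_true_positive(titration, beta, m))
```

Requiring the cutoff on both scales means the reference set does not favor either statistic by construction.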
Performance comparison based on differential methylation analysis
One of the major statistical paradigms in expression microarray analysis has been "fold-change ranking with a non-stringent p-value cutoff" [18-20]. Under this framework, the CpG sites are first subjected to a low-stringency p-value threshold (p < 0.05 without correction for multiple comparisons) and then ranked by fold change. We hypothesized that the M-value outperforms the Beta-value under this statistical framework because the M-value is more homoscedastic and therefore aligns better with the distribution assumptions of these statistical methods.
Following a similar logical framework, we first used a simple t-test to compare two technical replicates of Sample A and two technical replicates of Sample B, and required a differentially methylated CpG site to have a p-value < 0.05. We then separated these filtered CpG sites into the three analysis groups listed in the "Evaluation dataset" subsection: low (2221 CpG sites for Beta-value; 2794 CpG sites for M-value), middle (6855 CpG sites for Beta-value; 6179 CpG sites for M-value) and high (457 CpG sites for Beta-value; 625 CpG sites for M-value) methylation analysis groups. In each analysis group, we sorted the CpG sites in decreasing order based on their absolute methylation difference between Samples A and B, i.e., |\bar{x}_{A,i} - \bar{x}_{B,i}|, where \bar{x}_{A,i} represents the average methylation level of Sample A at the i^{th} CpG site. We then evaluated the performance of each method by selecting the top N CpG sites as an evaluation set, with N starting at 50 and incremented in steps of 50 until all sites were included in the evaluation set. For each evaluation set (top N CpG sites), we calculated the True Positive Rate (TPR), defined as the percentage of identified differentially methylated CpG sites that are included in the True Positives (TP) set, i.e., TPR = |TP ∩ CpG_{detected}| / |CpG_{detected}|, where CpG_{detected} represents the CpG sites included in the evaluation set. We also calculated the Detection Rate (DR) for each evaluation set, defined as the percentage of detected TP CpG sites among all TP CpG sites, i.e., DR = |TP ∩ CpG_{detected}| / |TP|. Figure 4 shows the performance curves of Beta-value and M-value based on the relationship of 1 - DR versus TPR. The definition of these curves is similar to that of the ROC (Receiver Operating Characteristic) curve. In an ideal situation, the best performance point is located at the top left corner of the figure, where both DR and TPR are equal to 1.
Comparing the performance curves of Beta-value and M-value, we can see that the M-value statistic performs much better than the Beta-value in the low and high methylation ranges. In the middle range, their performance is similar, although the Beta-value has slightly higher DR while the M-value has better TPR.
Figure 4. Performance comparisons of Beta-value and M-value in the range of low, middle and high methylation levels based on the relationship of 1 - Detection Rate versus True Positive Rate.
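The top-N evaluation can be sketched as below. This is an illustration of the TPR and DR definitions given above, not the authors' analysis code; the function name is hypothetical, and the caller is assumed to pass sites already filtered by p-value, restricted to one analysis group, and sorted by decreasing absolute methylation difference.

```python
def tpr_dr_curve(ranked_sites, true_positives, step=50):
    """Return a list of (N, TPR, DR) for the top N = step, 2*step, ... sites.

    ranked_sites: site identifiers in decreasing order of absolute difference.
    true_positives: identifiers in the reference True Positives set.
    """
    tp = set(true_positives)
    if not tp:
        raise ValueError("the True Positives set must not be empty")
    curve = []
    hits = 0
    prev = 0
    total = len(ranked_sites)
    n = min(step, total)
    while prev < total:
        # Count only the sites newly added to the evaluation set.
        hits += sum(1 for site in ranked_sites[prev:n] if site in tp)
        curve.append((n, hits / n, hits / len(tp)))
        prev = n
        n = min(n + step, total)  # final set includes all remaining sites
    return curve


# Toy example with hypothetical site names and a step of 2.
print(tpr_dr_curve(["a", "b", "c", "d", "e"], {"a", "c", "z"}, step=2))
```

Plotting 1 - DR on the horizontal axis against TPR on the vertical axis for each N gives a curve of the kind described for Figure 4.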
Refinement of the basic differential methylation analysis
Similar to other hybridization techniques, there is an inherent level of variability associated with sample preparation, sample loading, the microarrays and the detectors. To address this variability, it is very common to add a "minimum difference threshold" to filter out CpG sites with little difference between two biological conditions. We therefore next evaluated the performance of the Beta-value and M-value statistics when a minimum difference threshold is included in addition to the p-value requirement.
After imposing a difference threshold, the identified differentially methylated CpG sites have p-values < 0.05 and a mean methylation level difference between Samples A and B larger than the difference threshold. Figure 5 plots TPR and DR against the methylation difference threshold for the Beta-value and M-value methods. In Figure 5, at the starting point (with difference thresholds equal to 0), there are 9533 and 9535 identified CpG sites across the entire methylation range for Beta-value and M-value, respectively. At the end point (with difference thresholds equal to 0.25 and 2.0 for Beta-value and M-value, respectively), there are 5231 and 5168 identified CpG sites for Beta-value and M-value, respectively. This indicates that the threshold ranges for Beta-value and M-value in Figure 5 are comparable. Figure 5 shows that TPR improves as the difference threshold increases but the DR decreases. The performance of the Beta-value and M-value methods is very similar for the middle analysis group (covering the approximately linear range of the logit transformation). However, the performance of these methods differs substantially for the nonlinear (high and low) analysis groups. For the Beta-value statistic, the TPR increases as the difference threshold increases but DR drops dramatically. For the M-value statistic, the TPR increases more slowly, but DR remains high for much larger difference thresholds.
Figure 5. Performance comparisons of Beta-value and M-value based on the True Positive Rate (TPR) and Detection Rate (DR) at different thresholds of methylation difference. (A) TPR versus threshold of difference of Beta-value; (B) TPR versus threshold of difference of M-value; (C) DR versus threshold of difference of Beta-value; (D) DR versus threshold of difference of M-value.
Figure 5 also provides some guidance for selecting the difference thresholds of the Beta-value and M-value statistics. An ideal difference threshold would have both high TPR and high DR, but there is a tradeoff in selecting the threshold. From Figure 5, we can see that the TPR gradually increases with the difference threshold before stabilizing. Based on this, the difference threshold at the turning point of TPR can be set as the upper-limit threshold, because further increasing the threshold will not improve TPR very much. On the other hand, the DR is almost constant at low thresholds and then gradually decreases as the difference threshold increases. The difference threshold at the turning point of DR can therefore be set as the lower-limit threshold, because raising the threshold up to that point increases the TPR without degrading the DR. Based on these guidelines, we suggest that the threshold for the M-value method should be between about 0.4 and 1.4 (or from 1.32 to 2.64 if we convert the M-value to the non-log scale). For the Beta-value method, because of its severe heteroscedasticity in the low and high analysis groups, it is infeasible to provide a fixed threshold. We can only suggest a threshold for the Beta-value in the middle analysis group, which is between about 0.05 and 0.15. It should be noted that these threshold ranges depend on the distribution of intensities in the dataset, so ideally these thresholds should be determined for each dataset.
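As a worked check of the scale conversion quoted above, an M-value difference d corresponds to a fold change of 2^d on the non-log scale. The second line is an added illustration, not taken from the paper's data: assuming the logit relationship between the two statistics with offsets ignored, the same one-unit M-value step maps to very different Beta-value steps in the middle and at the high end of the range, which is consistent with a single fixed Beta-value threshold being hard to choose.

```latex
2^{0.4} \approx 1.32, \qquad 2^{1.4} \approx 2.64
% M: 0 -> 1 gives Beta: 1/2 -> 2/3;   M: 3 -> 4 gives Beta: 8/9 -> 16/17
\frac{2}{3} - \frac{1}{2} \approx 0.167, \qquad \frac{16}{17} - \frac{8}{9} \approx 0.052
```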
Discussion
The Beta-value method has already been widely used to calculate methylation levels, and it is the manufacturer-recommended method for analyzing Illumina Infinium HumanMethylation27 BeadChip microarrays. The M-value method has been widely used in expression microarray analysis, and has been used to calculate methylation levels in some methylation microarray analyses [13]. However, to date there has been no systematic evaluation of the relationship between the Beta-value and M-value methods. In this study, we demonstrate that the two methods are related by a logit transformation. They have an approximately linear relationship in the middle methylation range (defined as 0.2 to 0.8 for the Beta-value method), with significant compression above and below this range for the Beta-value method. The Beta-value ranges from 0 to 1 and can be interpreted as an approximation of the percentage of methylation. However, because the Beta-value has a bounded range, this statistic violates the Gaussian distribution assumption used by many statistical methods, including the very prevalent t-test. In comparison, the M-value statistic can be appropriately analyzed with these methods.
To compare the performance of the Beta-value and M-value methods in identifying differentially methylated CpG sites, we designed a methylation titration experiment. As we do not know the 'true' differentially methylated CpG sites, we defined a set of True Positives (TPs) based on high levels of correlation between the methylation and titration profiles. It is important to note that some true differentially methylated CpG sites may not be included in this set of TPs; at the same time, some false positives may also be included in the TPs. Fortunately, although a small number of false positives or false negatives will affect the estimation of TPRs and DRs, it does not affect the overall performance comparison between the two methods. (We ran simulations in which 10% of TPs were randomly added or removed, and found that the performance difference between Beta-values and M-values was consistent with the curves shown in Figure 4. The results were not included in the paper.) Comparing the performance based on top-ranked CpG sites (ranked by the absolute difference between the two groups being compared), the M-value method has better detection power and a higher True Positive Rate (TPR) in the low and high methylation ranges due to its reduced heteroscedasticity in these ranges. In the middle methylation range, the Beta-value method has slightly better detection power than the M-value method but a decreased TPR.
In microarray differential analysis, adding a difference (or fold-change) threshold is another common and effective way to improve the TPR. However, due to the severe heteroscedasticity of the Beta-value method outside the middle methylation range, it is impossible to impose a constant difference threshold across the entire methylation range for the Beta-value method. If a constant difference threshold is used for the Beta-value method, the detection rate outside the middle methylation range deteriorates severely. To solve this problem, Illumina proposed a customized model to detect differentially methylated CpG sites [21]. Basically, the model fits a parabola to the standard deviation as a function of Beta-value. However, this is inconvenient to implement, and the fitted parameters suggested by Illumina may change across different experiments under different conditions. Performing the same set of analyses using the M-value method demonstrates that using a constant difference threshold is appropriate and far easier to implement. Based on the comparison graphed in Figure 5, we suggest setting a threshold for the M-value method between 0.4 and 1.4 (or from 1.32 to 2.64 on the non-log scale).
Conclusions
The Beta-value method has a direct biological interpretation: it corresponds roughly to the percentage of a site that is methylated. This makes the Beta-value very attractive when modeling the underlying biological effect. However, this interpretation is an approximation [22], especially when the data has not been properly preprocessed and normalized. From an analytical and statistical standpoint, the Beta-value method has severe heteroscedasticity outside the middle methylation range, which imposes serious challenges in applying many statistical models. In comparison, the M-value method is more statistically valid in differential and other statistical analyses as it is approximately homoscedastic. Although the M-value statistic does not have an intuitive biological meaning, it is possible to provide an accurate estimation of methylation status by modeling the distribution of the M-value statistic. In differential methylation analysis, we recommend using the M-value because we can directly apply most statistical analysis methods designed for expression microarrays and because it is easy to implement a difference threshold adjustment to improve the TPR. In addition, a difference in M-value can be interpreted as a fold change on the non-log scale. Although both the Beta-value and M-value methods have some limitations, the two statistics are interconvertible using Equation 3, enabling the use of the most appropriate method. We recommend using the M-value method for differential methylation analysis and also including the Beta-value statistic in final reports due to its intuitive biological interpretation.
Methods
Titration Samples
Similar to the titration design using GoldenGate methylation chips by Bibikova et al. [12], we selected two samples with known methylation differences. Sample A is NA 10923 from the Coriell Institute for Medical Research. It is a B-lymphocyte sample from a male donor. Sample B is the HTB-38 cell line from ATCC (http://www.atcc.org). It is a colon cancer sample from a female donor. Samples A and B were normalized to the same concentration, and then mixed in five different titration ratios. Table 1 shows the detailed information. The numbers in rows 2 and 3 of Table 1 are the percentages of Samples A and B in the titration sample. Row 4 is the number of replicates of each sample.
DNA Methylation Profiling using Illumina Infinium BeadChip Microarrays
The DNA samples were prepared following the guidelines suggested by the manufacturer (Illumina, Inc.), and then measured by Illumina Infinium HumanMethylation27 BeadChip, which measures 27578 CpG sites. The HumanMethylation27 BeadChip contains a pair of methylated and unmethylated probes designed for each CpG site. All experiments were conducted following the manufacturer's protocols by the Genomics Core at Northwestern University. The Illumina BeadChips were scanned with an Illumina BeadArray Reader and then preprocessed by the Illumina GenomeStudio software. Raw data have been deposited in the NCBI GEO database under the accession number of GSE23789.
We used the Bioconductor methylumi package [23] to read the methylation files output by the Illumina GenomeStudio software and processed the methylation data using the Bioconductor lumi package [24]. The methylation data first underwent QC and a color balance check, and was then background corrected and scaled based on the mean of all probes (using the methylation simple scaling normalization (SSN) implemented in the lumi package). Beta-value and M-value statistics were calculated based on Equations 1 and 2. The related preprocessing functions are included in the Bioconductor lumi package (version > 2.0) [24]. As a pre-filtering step, 82 CpG sites for which more than 50% of samples had detection p-values worse than 0.0001 were removed before the analysis. The Pearson correlation method was used to calculate the correlation between the titration and methylation profiles. Welch's t-test was used to identify the differentially methylated CpG sites.
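For readers unfamiliar with Welch's t-test, the statistic and its approximate degrees of freedom can be sketched with the Python standard library. This is an illustration only; the text does not state which implementation was used, and the function name and example values are hypothetical. The sketch stops short of the p-value, which requires the t-distribution (for example via `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    a, b: replicate measurements for the two groups (at least two each).
    """
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb
    if se2 == 0:
        raise ValueError("both groups have zero variance")
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Two hypothetical technical replicates per sample, on the M-value scale.
t, df = welch_t([1.0, 3.0], [5.0, 7.0])
print(round(t, 3), round(df, 3))
```

With only two replicates per group the degrees of freedom are small, which is one reason the framework described above pairs a lenient p-value cutoff with ranking by difference.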
Authors' contributions
PD and SML initiated the idea of this paper. PD conducted all data analysis and drafted the manuscript. LH and SML supervised the methylation project. CH participated in all discussions of data analysis and manuscript revisions. SML, PD, LH and WAK designed the titration experiment. XZ performed the titration experiment. All authors participated in the project at different stages, discussed the results and commented on the manuscript. All authors read and approved the final manuscript.
Acknowledgements
We appreciate the very constructive critique and insightful comments of the reviewers. This work was supported in part by the NIH award 1RC1ES01846101 to LH. PD, SML and WAK acknowledge the support of P30CA060553 and UL1RR025741. We would like to thank Vivi Frangidakis for conducting the Illumina BeadChip experiments and Leming Shi for discussing the "FC-ranking" paradigm. We would also like to acknowledge other participants in the "DNA Methylation Alterations in Response to Pesticides Exposure" project meetings for their input and support: Hehuang Xie, Min Wang, Yue Yu and Marcelo Bento Soares.
References

1. Laird PW: Principles and challenges of genome-wide DNA methylation analysis.
Nat Rev Genet 2010, 11(3):191-203.

2. Esteller M: CpG island hypermethylation and tumor suppressor genes: a booming present, a brighter future.
Oncogene 2002, 21(35):5427-5440.

3. Herman JG, Baylin SB: Gene silencing in cancer in association with promoter hypermethylation.
N Engl J Med 2003, 349(21):2042-2054.

4. Shen L, Kondo Y, Guo Y, Zhang J, Zhang L, Ahmed S, Shu J, Chen X, Waterland RA, Issa JP: Genome-wide profiling of DNA methylation reveals a class of normally methylated CpG island promoters.
PLoS Genet 2007, 3(10):2023-2036.

5. O'Riain C, O'Shea DM, Yang Y, Le Dieu R, Gribben JG, Summers K, Yeboah-Afari J, Bhaw-Rosun L, Fleischmann C, Mein CA, et al.: Array-based DNA methylation profiling in follicular lymphoma.
Leukemia 2009, 23(10):1858-1866.

6. Breton CV, Byun HM, Wenten M, Pan F, Yang A, Gilliland FD: Prenatal tobacco smoke exposure affects global and gene-specific DNA methylation.
Am J Respir Crit Care Med 2009, 180(5):462-467.

7. Bell CG, Teschendorff AE, Rakyan VK, Maxwell AP, Beck S, Savage DA: Genome-wide DNA methylation analysis for diabetic nephropathy in type 1 diabetes mellitus.
BMC Med Genomics 2010, 3:33.

8. Thirlwell C, Eymard M, Feber A, Teschendorff A, Pearce K, Lechner M, Widschwendter M, Beck S: Genome-wide DNA methylation analysis of archival formalin-fixed paraffin-embedded tissue using the Illumina Infinium HumanMethylation27 BeadChip.
Methods 2010, 52(3):248-54.

9. Grafodatskaya D, Choufani S, Ferreira JC, Butcher DT, Lou Y, Zhao C, Scherer SW, Weksberg R: EBV transformation and cell culturing destabilizes DNA methylation in human lymphoblastoid cell lines.
Genomics 2010, 95(2):73-83.

10. Weisenberger DJ, Berg DVD, Pan F, Berman BP, Laird PW: Comprehensive DNA Methylation Analysis on the Illumina Infinium Assay Platform. [http://www.illumina.com/support/literature.ilmn]

11. Bibikova M, Fan JB: GoldenGate assay for DNA methylation profiling.
Methods Mol Biol 2009, 507:149-163.

12. Bibikova M, Lin Z, Zhou L, Chudin E, Garcia EW, Wu B, Doucet D, Thomas NJ, Wang Y, Vollmer E, et al.: High-throughput DNA methylation profiling using universal bead arrays.
Genome Res 2006, 16(3):383-393.

13. Irizarry RA, Ladd-Acosta C, Carvalho B, Wu H, Brandenburg SA, Jeddeloh JA, Wen B, Feinberg AP: Comprehensive high-throughput arrays for relative methylation (CHARM).
Genome Res 2008, 18(5):780-790.

14. Houseman EA, Christensen BC, Yeh RF, Marsit CJ, Karagas MR, Wrensch M, Nelson HH, Wiemels J, Zheng S, Wiencke JK, et al.: Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions.
BMC Bioinformatics 2008, 9:365.

15. Durbin BP, Hardin JS, Hawkins DM, Rocke DM: A variance-stabilizing transformation for gene-expression microarray data.
Bioinformatics 2002, 18(Suppl 1):S105-110.

16. Lin SM, Du P, Huber W, Kibbe WA: Model-based variance-stabilizing transformation for Illumina microarray data.
Nucleic Acids Res 2008, 36(2):e11.

17. Barnes M, Freudenberg J, Thompson S, Aronow B, Pavlidis P: Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms.
Nucleic Acids Res 2005, 33(18):5914-5923.

18. Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, et al.: The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements.
Nat Biotechnol 2006, 24(9):1151-1161.

19. Guo L, Lobenhofer EK, Wang C, Shippy R, Harris SC, Zhang L, Mei N, Chen T, Herman D, Goodsaid FM, et al.: Rat toxicogenomic study reveals analytical consistency across microarray platforms.
Nat Biotechnol 2006, 24(9):1162-1169.

20. Shi L, Jones WD, Jensen RV, Harris SC, Perkins RG, Goodsaid FM, Guo L, Croner LJ, Boysen C, Fang H, et al.: The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies.
BMC Bioinformatics 2008, 9(Suppl 9):S10.

21. Illumina: GenomeStudio Methylation Module v1.0 User Guide. [http://www.illumina.com/support/documentation.ilmn]

22. Illumina: GoldenGate Assay for Methylation and BeadArray Technology. [http://www.illumina.com/Documents/products/technotes/technote_goldengate_assay_methylation.pdf]

23. Davis S, Bilke S: methylumi: Handle Illumina methylation data.

24. Du P, Kibbe WA, Lin SM: lumi: a pipeline for processing Illumina microarray.
Bioinformatics 2008, 24(13):1547-1548.