DNA methylation arrays as surrogate measures of cell mixture distribution

Eugene Andres Houseman1*, William P Accomando2, Devin C Koestler3, Brock C Christensen3, Carmen J Marsit3, Heather H Nelson4, John K Wiencke5 and Karl T Kelsey2,6

Author affiliations

1 College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 97331, USA

2 Department of Pathology and Laboratory Medicine, Brown University, Providence, RI 02912, USA

3 Section of Biostatistics and Epidemiology, Dartmouth Medical School, Hanover, NH 03755, USA

4 Department of Epidemiology, University of Minnesota, Minneapolis, MN 55455, USA

5 Department of Neurological Surgery, University of California San Francisco, San Francisco, CA 94158, USA

6 Department of Epidemiology, Brown University, Providence, RI 02912, USA

Citation and License

BMC Bioinformatics 2012, 13:86  doi:10.1186/1471-2105-13-86

Published: 8 May 2012



There has been a long-standing need in biomedical research for a method that quantifies the normally mixed composition of leukocytes beyond what is possible by simple histological or flow cytometric assessments. The latter is restricted by the labile nature of protein epitopes, requirements for cell processing, and the need for timely cell analysis. In a diverse array of diseases and following numerous immune-toxic exposures, leukocyte composition will critically inform the immuno-biology underlying most chronic medical conditions. Emerging research demonstrates that DNA methylation is responsible for cellular differentiation and that, when measured in whole peripheral blood, it serves to distinguish cancer cases from controls.


Here we present a method, similar to regression calibration, for inferring changes in the distribution of white blood cells between different subpopulations (e.g. cases and controls) using DNA methylation signatures, in combination with a previously obtained external validation set consisting of signatures from purified leukocyte samples. We validate the fundamental idea in a cell mixture reconstruction experiment, then demonstrate our method on DNA methylation data sets from several studies, including data from a Head and Neck Squamous Cell Carcinoma (HNSCC) study and an ovarian cancer study. Our method produces results consistent with prior biological findings, thereby validating the approach.
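
The idea of reconstructing a cell mixture from reference signatures can be illustrated with a small sketch. This abstract does not specify the estimation procedure, so the example below assumes one plausible formulation: the methylation of a mixed sample is modeled as a non-negative weighted sum of purified cell-type profiles, and the weights are recovered by non-negative least squares (solved here with plain projected gradient descent). The function name `estimate_mixture`, the three hypothetical cell types, and all methylation values are invented for illustration and are not taken from the study.

```python
def estimate_mixture(reference, sample, n_iter=2000):
    """Illustrative non-negative least squares via projected gradient descent.

    reference: list of rows, one per CpG site; each row holds the mean
               methylation of that site in each purified cell type.
    sample:    methylation values of the mixed sample at the same sites.
    Returns one non-negative weight per cell type.
    (Hypothetical helper; an assumption, not the authors' implementation.)
    """
    n_sites = len(reference)
    n_types = len(reference[0])

    # Precompute M'M and M'y, where M is the reference matrix.
    gram = [[sum(reference[s][i] * reference[s][j] for s in range(n_sites))
             for j in range(n_types)] for i in range(n_types)]
    rhs = [sum(reference[s][i] * sample[s] for s in range(n_sites))
           for i in range(n_types)]

    # The trace of M'M bounds its largest eigenvalue, so 1/trace is a safe step.
    step = 1.0 / sum(gram[i][i] for i in range(n_types))

    w = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Gradient of 0.5 * ||M w - y||^2 is M'M w - M'y.
        grad = [sum(gram[i][j] * w[j] for j in range(n_types)) - rhs[i]
                for i in range(n_types)]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, w[i] - step * grad[i]) for i in range(n_types)]
    return w


# Made-up reference profiles: 6 CpG sites (rows) x 3 cell types (columns).
reference = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.1],
    [0.2, 0.8, 0.2],
    [0.1, 0.2, 0.8],
]

# Simulate a noise-free mixed sample with known (made-up) proportions.
true_weights = [0.6, 0.3, 0.1]
sample = [sum(row[j] * true_weights[j] for j in range(3)) for row in reference]

weights = estimate_mixture(reference, sample)
# Rescale so the estimates can be read as proportions.
proportions = [x / sum(weights) for x in weights]
```

Under these assumptions, comparing such estimated proportions between subpopulations (e.g. cases and controls) is the kind of downstream analysis the method targets; real data would add measurement noise and many more CpG sites.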


Our method, in combination with an appropriate external validation set, promises new opportunities for large-scale immunological studies of both disease states and noxious exposures.