This article is part of the supplement: Selected articles from the IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS) 2011

Open Access Research

Methods for high-throughput MethylCap-Seq data analysis

Benjamin AT Rodriguez1, David Frankhouser1, Mark Murphy1, Michael Trimarchi1, Hok-Hei Tam1, John Curfman1, Rita Huang2, Michael WY Chan3, Hung-Cheng Lai2, Deval Parikh1, Bryan Ball1, Sebastian Schwind1, William Blum1, Guido Marcucci1, Pearlly Yan1* and Ralf Bundschuh4*

Author affiliations

1 The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA

2 Graduate Institute of Medical Sciences, Department of Obstetrics and Gynecology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan

3 Department of Life Science, National Chung Cheng University, Min-Hsiung, Chia-Yi, Taiwan

4 Departments of Physics and Biochemistry, Center for RNA Biology, The Ohio State University, Columbus, Ohio, USA

Citation and License

BMC Genomics 2012, 13(Suppl 6):S14  doi:10.1186/1471-2164-13-S6-S14

Published: 26 October 2012



Advances in whole-genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next-generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measures, and data visualization. Currently, there is a lack of workflows for efficient analysis of large MethylCap-seq datasets containing multiple sample groups.


The NGS application MethylCap-seq involves the in vitro capture of methylated DNA and subsequent analysis of the enriched fragments by massively parallel sequencing. The workflow we describe performs MethylCap-seq experimental quality control (QC), sequence file processing and alignment, differential methylation analysis of multiple biological groups, hierarchical clustering, assessment of genome-wide methylation patterns, and preparation of files for data visualization.
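To make the processing steps above more concrete, the following is a minimal sketch of one common way to turn aligned reads into comparable per-region signals: count reads in fixed-width genomic bins and scale the counts to reads per million (RPM). The function names, the 500 bp bin size, and the toy read positions are illustrative assumptions and are not taken from the workflow itself.

```python
from collections import Counter


def bin_read_counts(read_starts, bin_size=500):
    """Count aligned reads per fixed-width genomic bin.

    read_starts: iterable of (chromosome, 0-based start position) tuples.
    bin_size: bin width in base pairs (an assumed, illustrative value).
    Returns a Counter keyed by (chromosome, bin_index).
    """
    counts = Counter()
    for chrom, start in read_starts:
        counts[(chrom, start // bin_size)] += 1
    return counts


def reads_per_million(counts):
    """Scale raw bin counts by the total read count so that samples
    sequenced to different depths can be compared."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n * 1_000_000 / total for key, n in counts.items()}


# Toy input: four aligned reads on two chromosomes (hypothetical data).
reads = [("chr1", 120), ("chr1", 480), ("chr1", 510), ("chr2", 30)]
rpm = reads_per_million(bin_read_counts(reads))
```

Normalizing by the total read count is only one possible choice; a real analysis may also need to account for duplicate reads, mappability, or CpG density.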


Here, we present a scalable, flexible workflow for MethylCap-seq QC, secondary data analysis, tertiary analysis of multiple experimental groups, and data visualization. We demonstrate the experimental QC procedure with results from a large ovarian cancer study dataset and propose parameters that can identify problematic experiments. Promoter methylation profiling and hierarchical clustering analyses are demonstrated for four groups of acute myeloid leukemia (AML) patients. We propose a Global Methylation Indicator (GMI) function to assess genome-wide changes in methylation patterns between experimental groups. We also show how the workflow facilitates data visualization in a web browser with the application Anno-J.
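As an illustration of the hierarchical clustering step, the sketch below groups samples by their promoter methylation profiles using agglomerative clustering with average linkage and Euclidean distance. The linkage method, the distance measure, the sample names, and the values are all assumptions chosen for the example; the abstract does not state which settings the workflow uses, and the GMI function is not reproduced here because it is not defined in this section.

```python
from itertools import combinations
from math import dist


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with average linkage and Euclidean distance.

    profiles: dict mapping sample name -> list of per-promoter values.
    n_clusters: number of clusters to stop at (must be >= 1).
    Returns a sorted list of clusters, each a sorted list of sample names.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")

    # Start with every sample in its own cluster.
    clusters = [[name] for name in profiles]

    def cluster_distance(a, b):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find and merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return sorted(sorted(c) for c in clusters)


# Hypothetical promoter methylation profiles for four samples.
profiles = {
    "patient_A": [0.9, 0.1, 0.8],
    "patient_B": [1.0, 0.2, 0.7],
    "patient_C": [0.1, 0.9, 0.2],
    "patient_D": [0.2, 1.0, 0.1],
}
groups = average_linkage(profiles, n_clusters=2)
```

This pure-Python version recomputes all pairwise distances on every merge, which is fine for a handful of samples but too slow for large cohorts; a dedicated statistics library would be the practical choice there.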


This workflow and its suite of features will assist biologists in conducting methylation profiling projects and facilitate meaningful biological interpretation.