Open Access Software

BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data

Weilong Guo1,2, Petko Fiziev3, Weihong Yan4, Shawn Cokus2, Xueguang Sun5, Michael Q Zhang1,6, Pao-Yang Chen7* and Matteo Pellegrini2,8*

Author Affiliations

1 Center for Synthetic & Systems Biology, TNLIST, Tsinghua University, Beijing 100084, China

2 Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA 90095, USA

3 Department of Biological Chemistry, University of California, Los Angeles, CA 90095, USA

4 Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA

5 Zymo Research Corp, 17062 Murphy Ave, Irvine, CA 92614, USA

6 Department of Molecular and Cell Biology, Center for Systems Biology, The University of Texas at Dallas, Richardson, TX 75080, USA

7 Institute of Plant and Microbial Biology, Academia Sinica, Taipei 11529, Taiwan

8 Institute for Genomics and Proteomics, University of California, Los Angeles, CA 90095, USA

BMC Genomics 2013, 14:774  doi:10.1186/1471-2164-14-774

Published: 10 November 2013

Abstract

Background

DNA methylation is an important epigenetic modification involved in many biological processes. Bisulfite treatment coupled with high-throughput sequencing provides an effective approach for studying genome-wide DNA methylation at single-base resolution. Library types such as whole genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS) are widely used for generating DNA methylomes, creating a demand for efficient and versatile tools for aligning bisulfite sequencing data.

Results

We have developed BS-Seeker2, an updated version of BS Seeker, as a full pipeline for mapping bisulfite sequencing data and generating DNA methylomes. BS-Seeker2 improves mappability over existing aligners by using local alignment. It can also map reads from RRBS libraries by building special indexes, with improved efficiency and accuracy. Moreover, BS-Seeker2 provides an additional function for filtering out reads with incomplete bisulfite conversion, which is useful in minimizing the overestimation of DNA methylation levels. We also defined the CGmap and ATCGmap file formats for full representations of DNA methylomes; these are among the outputs of the BS-Seeker2 pipeline, together with BAM and WIG files.
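The idea behind filtering incompletely converted reads can be illustrated with a minimal sketch. It is not BS-Seeker2's implementation: the function names, the thresholds (`min_fraction`, `min_count`), and the simplification to gap-free forward-strand alignments are all assumptions made for illustration. The sketch relies on the common heuristic that a read in which most non-CpG cytosines remain unconverted is more likely a conversion failure than genuinely methylated DNA.

```python
def count_non_cg_cytosines(read, ref):
    """Return (unconverted, total) over non-CpG reference cytosines.

    `read` and `ref` are equal-length, gap-free, forward-strand strings.
    This is a simplification; real alignments also involve indels and
    reverse-strand reads.
    """
    total = unconverted = 0
    for i, ref_base in enumerate(ref):
        if ref_base != "C":
            continue
        # Skip CpG sites and cytosines whose context is cut off by the window end.
        if i + 1 >= len(ref) or ref[i + 1] == "G":
            continue
        if read[i] == "C":      # still C after bisulfite treatment
            total += 1
            unconverted += 1
        elif read[i] == "T":    # converted to T, as expected when unmethylated
            total += 1
        # Any other read base is treated as a mismatch and ignored.
    return unconverted, total


def looks_unconverted(read, ref, min_fraction=0.5, min_count=5):
    """Flag a read whose non-CpG cytosines are mostly unconverted.

    Both thresholds are hypothetical defaults chosen for this example.
    """
    unconverted, total = count_non_cg_cytosines(read, ref)
    if total == 0:
        return False
    return unconverted > min_count and unconverted / total > min_fraction
```

Requiring both a minimum count and a minimum fraction keeps short reads with only one or two informative cytosines from being discarded on weak evidence.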

Conclusions

Our performance evaluations show that BS-Seeker2 works efficiently and accurately for both WGBS and RRBS data. BS-Seeker2 is freely available at http://pellegrini.mcdb.ucla.edu/BS_Seeker2/ and on the Galaxy server.

Keywords:
DNA methylation; Bisulfite sequencing aligner; WGBS; RRBS; BS Seeker; Bisulfite conversion failure; Galaxy toolshed