BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data
1 Center for Synthetic & Systems Biology, TNLIST, Tsinghua University, Beijing 100084, China
2 Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA 90095, USA
3 Department of Biological Chemistry, University of California, Los Angeles, CA 90095, USA
4 Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
5 Zymo Research Corp, 17062 Murphy Ave, Irvine, CA 92614, USA
6 Department of Molecular and Cell Biology, Center for Systems Biology, The University of Texas at Dallas, Richardson, TX 75080, USA
7 Institute of Plant and Microbial Biology, Academia Sinica, Taipei 11529, Taiwan
8 Institute for Genomics and Proteomics, University of California, Los Angeles, CA 90095, USA
BMC Genomics 2013, 14:774 doi:10.1186/1471-2164-14-774 Published: 10 November 2013
DNA methylation is an important epigenetic modification involved in many biological processes. Bisulfite treatment coupled with high-throughput sequencing provides an effective approach for studying genome-wide DNA methylation at base resolution. Libraries such as whole genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS) are widely used for generating DNA methylomes, creating a demand for efficient and versatile tools for aligning bisulfite sequencing data.
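To make the base-resolution read-out concrete, the following is a minimal Python sketch of idealized bisulfite conversion and of the per-site methylation level derived from read counts. It is not taken from BS-Seeker2: the function names `bisulfite_convert` and `methylation_level` are hypothetical, and the sketch assumes complete conversion, a single strand, and no sequencing errors.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return `seq` as it would be read after idealized bisulfite treatment.

    Unmethylated cytosines are read as T; methylated cytosines stay C.
    Positions are 0-based indices into `seq` (an assumption of this sketch).
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def methylation_level(c_count, t_count):
    """Fraction of reads showing C at one cytosine, or None if uncovered."""
    total = c_count + t_count
    return c_count / total if total else None


# Only the cytosine at index 1 is methylated, so it alone survives as C.
converted = bisulfite_convert("ACGTCCGA", methylated_positions=[1])
print(converted)
print(methylation_level(c_count=3, t_count=1))
```

Because converted reads no longer match the reference at unmethylated cytosines, an aligner for such data has to tolerate C-to-T differences, which is why dedicated tools are needed.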
We have developed BS-Seeker2, an updated version of BS Seeker, as a full pipeline for mapping bisulfite sequencing data and generating DNA methylomes. BS-Seeker2 improves mappability over existing aligners by using local alignment. It can also map reads from RRBS libraries by building special indexes, with improved efficiency and accuracy. Moreover, BS-Seeker2 provides an additional function for filtering out reads with incomplete bisulfite conversion, which is useful in minimizing the overestimation of DNA methylation levels. We also defined the CGmap and ATCGmap file formats for full representation of DNA methylomes; these are part of the output of the BS-Seeker2 pipeline, together with BAM and WIG files.
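The idea behind filtering incompletely converted reads can be illustrated with a short Python sketch. The heuristic shown here is an assumption made for illustration, not BS-Seeker2's actual criterion: a read is flagged when most of its non-CpG cytosines remain unconverted. The function names and the thresholds `min_sites` and `max_fraction` are hypothetical, and the sketch assumes an ungapped, same-strand alignment in an organism where non-CpG methylation is rare.

```python
def unconverted_ch_fraction(ref, read):
    """Count non-CpG cytosines in `ref` and how many are still C in `read`.

    `ref` and `read` are assumed to be ungapped, same-strand strings of
    equal length. Returns (unconverted, total_non_cpg_sites).
    """
    if len(ref) != len(read):
        raise ValueError("ref and read must have equal length")
    total = unconverted = 0
    # Stop one base early: the context of a final C is unknown
    # without the base that follows it.
    for i in range(len(ref) - 1):
        if ref[i] == "C" and ref[i + 1] != "G":
            total += 1
            if read[i] == "C":
                unconverted += 1
    return unconverted, total


def looks_unconverted(ref, read, min_sites=3, max_fraction=0.5):
    """Flag a read whose non-CpG cytosines are mostly unconverted."""
    unconverted, total = unconverted_ch_fraction(ref, read)
    return total >= min_sites and unconverted / total > max_fraction


ref = "ACCATCGCTACA"
print(looks_unconverted(ref, "ATTATCGTTATA"))  # non-CpG cytosines all read as T
print(looks_unconverted(ref, ref))             # every cytosine still read as C
```

Reads flagged this way would otherwise contribute unconverted cytosines that are counted as methylated, inflating the estimated methylation level.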
Our evaluations show that BS-Seeker2 works efficiently and accurately for both WGBS and RRBS data. BS-Seeker2 is freely available at http://pellegrini.mcdb.ucla.edu/BS_Seeker2/ and on the Galaxy server.