ReQON: a Bioconductor package for recalibrating quality scores from next-generation sequencing data
1 Department of Statistics and Operations Research, Chapel Hill, NC, USA
2 Renaissance Computing Center, Chapel Hill, NC, USA
3 Lineberger Comprehensive Cancer Center, Chapel Hill, NC, USA
4 Department of Genetics, Chapel Hill, NC, USA
5 Department of Internal Medicine, Division of Medical Oncology, Multidisciplinary Thoracic Oncology Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
BMC Bioinformatics 2012, 13:221 doi:10.1186/1471-2105-13-221
Published: 4 September 2012
Next-generation sequencing technologies have become important tools for genome-wide studies. However, the quality scores that are assigned to each base have been shown to be inaccurate. If the quality scores are used in downstream analyses, these inaccuracies can have a significant impact on the results.
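To make "inaccurate" concrete: under the standard Phred convention, a quality score Q corresponds to an error probability p through Q = -10 log10(p). The following minimal sketch compares a reported score with the quality implied by an observed error rate. The helper names and the toy counts are illustrative assumptions, not part of ReQON or of the data analyzed in the paper.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred quality score to the error probability it implies."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert an error probability to a Phred-scaled quality score."""
    return -10 * math.log10(p)

def empirical_quality(n_errors, n_bases):
    """Phred-scaled quality implied by an observed error count (hypothetical helper)."""
    return error_prob_to_phred(n_errors / n_bases)

# Toy numbers (assumed for illustration): bases reported as Q30,
# but 5 mismatches are observed among 1000 such bases.
reported_q = 30
observed_q = empirical_quality(5, 1000)

print(f"reported Q{reported_q} implies p = {phred_to_error_prob(reported_q):.4f}")
print(f"observed error rate implies Q = {observed_q:.1f}")
```

In this toy case the reported score overstates the quality: the observed error rate corresponds to roughly Q23 rather than Q30. A gap of this kind is what recalibration aims to close.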
Here we present ReQON, a tool that uses logistic regression to recalibrate the base quality scores in an input BAM file of aligned sequencing data. ReQON also generates diagnostic plots showing the effectiveness of the recalibration. We show that the quality scores ReQON produces are more accurate than the original quality scores, in the sense that they correspond more closely to the probability of a sequencing error, and that they discriminate better between sequencing errors and non-errors. We also compare ReQON to other available recalibration tools and show that ReQON is less biased and performs favorably in terms of quality score accuracy.
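The general idea of recalibrating by logistic regression can be sketched as follows. This is not ReQON's implementation: the covariates ReQON uses are not listed in this abstract, so the sketch assumes a single predictor (the reported quality score), a plain gradient-descent fit, and made-up toy counts. All function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, n_iter=3000):
    """Fit P(error) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            residual = sigmoid(b0 + b1 * x) - y
            g0 += residual
            g1 += residual * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def recalibrated_quality(q, b0, b1):
    """Phred-scale the error probability predicted for reported quality q."""
    p = sigmoid(b0 + b1 * q / 10.0)      # same q/10 scaling as in training
    p = min(max(p, 1e-10), 1 - 1e-10)    # guard against log10(0)
    return -10 * math.log10(p)

# Toy training set (assumed): for each reported quality, 100 bases with a
# known number of mismatches (y = 1 marks a sequencing error).
toy_counts = {10: 30, 20: 10, 30: 3}     # reported Q -> errors per 100 bases
xs, ys = [], []
for q, n_err in toy_counts.items():
    xs += [q / 10.0] * 100               # scale the predictor for a stable fit
    ys += [1] * n_err + [0] * (100 - n_err)

b0, b1 = fit_logistic(xs, ys)
for q in sorted(toy_counts):
    print(f"reported Q{q} -> recalibrated Q{recalibrated_quality(q, b0, b1):.1f}")
```

The fitted model maps each base to a predicted error probability, which is then converted back to the Phred scale. In practice, the error labels would have to come from comparing aligned reads with a trusted reference, and a real tool would likely use more predictors than the reported score alone.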
ReQON is an open-source software package, written in R and available through Bioconductor, for recalibrating base quality scores for next-generation sequencing data. It produces a new BAM file with more accurate quality scores, which can improve the results of downstream analyses, along with several diagnostic plots showing the effectiveness of the recalibration.