QTrim: a novel tool for the quality trimming of sequence reads generated using the Roche/454 sequencing platform
1 South African National Bioinformatics Institute, SA MRC Bioinformatics Unit, University of the Western Cape, Private Bag X17, Bellville 7535, South Africa
2 Ryan Institute, School of Natural Sciences, National University of Ireland Galway, Galway, Ireland
3 Retroviruses and Molecular Evolution, CNRS-UPR 9002, Institut de Biologie Moléculaire et Cellulaire, 15 rue René Descartes, Strasbourg 67084, France
4 School of Life Sciences, University of Warwick, Gibbet Hill Rd, Coventry CV4 7AL, UK
BMC Bioinformatics 2014, 15:33 doi:10.1186/1471-2105-15-33
Published: 30 January 2014
Many high-throughput sequencing (HTS) approaches, such as the Roche/454 platform, produce reads in which sequence quality (as measured by Phred-like quality scores) decreases linearly across the read. Quality trimming of these data is essential for confidence in the results of downstream analysis. Here, we present QTrim, a novel, highly sensitive and accurate approach for the quality trimming of sequence reads generated using the Roche/454 sequencing platform (or any platform that produces long reads with Phred-like quality scores).
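For context, a Phred-like score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q20 implies roughly a 1 in 100 chance of error. The Python sketch below illustrates that relationship together with a generic 3'-end trim driven by mean quality. It is not the QTrim algorithm, which is not described in this abstract; the function names and the default threshold of 20 are assumptions chosen for illustration.

```python
def phred_to_error_prob(q):
    """Convert a Phred-like quality score to an estimated error probability."""
    return 10 ** (-q / 10)


def trim_3prime_by_mean(quals, min_mean=20):
    """Return the length of the longest prefix whose mean quality is >= min_mean.

    Hypothetical illustration only: bases are dropped from the 3' end one at a
    time until the mean quality of the remaining bases reaches the threshold.
    """
    end = len(quals)
    total = sum(quals)
    while end > 0 and total / end < min_mean:
        end -= 1
        total -= quals[end]  # remove the quality of the base just trimmed
    return end
```

Because quality tends to fall towards the 3' end of a 454 read, trimming from that end is the natural direction for a simple scheme like this one; a read whose mean never reaches the threshold is trimmed to length zero.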
The performance of QTrim was evaluated against all other available quality trimming approaches on both poor and high quality 454 sequence data. In all cases, QTrim appears to perform as well as the best alternative approach (PRINSEQ), with these two methods significantly outperforming all other methods. Further analysis of the trimmed data revealed that the novel trimming approach implemented in QTrim ensures that the prevalence of low quality bases in the resulting trimmed data is substantially lower than in data trimmed with PRINSEQ or any of the other approaches tested.
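One way to quantify the prevalence of low quality bases in trimmed data is the fraction of retained bases whose score falls below a cutoff. The sketch below assumes that definition and a cutoff of 20; the abstract does not state the exact measure used in the evaluation, and the function name is hypothetical.

```python
def low_quality_fraction(trimmed_quals, threshold=20):
    """Fraction of bases across all trimmed reads with quality below threshold.

    trimmed_quals is a list of per-read quality score lists (assumed input
    shape). Returns 0.0 when no bases remain after trimming.
    """
    total = sum(len(quals) for quals in trimmed_quals)
    if total == 0:
        return 0.0
    low = sum(1 for quals in trimmed_quals for q in quals if q < threshold)
    return low / total
```

Computing this value on the output of each trimming tool, with the same cutoff, gives a like-for-like comparison of how many low quality bases each tool leaves behind.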
QTrim is a novel, highly sensitive and accurate algorithm for the quality trimming of Roche/454 sequence reads. It is implemented both as an executable program that can be integrated with standalone sequence analysis pipelines and as a web-based application to enable individuals with little or no bioinformatics experience to quality trim their sequence data.