This article is part of the supplement: Ninth International Conference on Bioinformatics (InCoB2010): Computational Biology
NGSQC: cross-platform quality analysis pipeline for deep sequencing data
1 Department of Psychiatry and Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
2 Michigan Center for Translational Pathology, Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
3 Division of Infectious Diseases, Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
4 Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI 48109, USA
5 National Center for Integrative Biomedical Informatics, University of Michigan, Ann Arbor, MI 48109, USA
Citation and License
BMC Genomics 2010, 11(Suppl 4):S7 doi:10.1186/1471-2164-11-S4-S7
Published: 2 December 2010
While the accuracy and precision of deep sequencing data are significantly better than those obtained by the earlier generation of hybridization-based high-throughput technologies, the digital nature of deep sequencing output often leads to unwarranted confidence in its reliability.
The NGSQC (Next Generation Sequencing Quality Control) pipeline provides a set of novel quality control measures for quickly detecting a wide variety of quality issues in deep sequencing data derived from two-dimensional surfaces, regardless of the assay technology used. It also enables researchers to determine whether their most interesting biological discoveries may be artifacts of sequencing quality issues in the underlying data.
Next-generation sequencing platforms have their own share of quality issues, and there can be significant lab-to-lab, batch-to-batch, and even within-chip/slide variation. NGSQC can help to ensure that biological conclusions, in particular those based on relatively rare sequence alterations, are not artifacts of low-quality sequencing.