Table 1

Read error correction frequencies

                               Quality (Geometric Mean)
                               0.9 - 1.0     0.6 - 0.9

Error-free Input and Output      544 669        21 095
All Errors Retained                4 023         4 675
Input Errors Reduced              50 082        27 668
Errors Introduced                      0            37

Total                            598 774        53 475


Summary of error frequencies in assembled Illumina paired-end reads generated from sequenced V3-region amplicons of Methylococcus capsulatus strain Bath.

Errors were analyzed only within the region of overlap, which is the part of each read pair relevant to PANDAseq assembly. Low-abundance "contamination" was observed in the dataset (data not shown), possibly originating from reagents used for PCR. Such contaminant sequences contribute to the counts of sequences in which all errors were retained. This category also includes sequences in which both reads contain low-quality bases whose quality scores were masked by CASAVA.
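
As a quick consistency check on Table 1, the column totals and the share of each outcome within a quality bin can be recomputed from the reported counts. The following is a minimal Python sketch. The counts are copied from the table, while the names `COUNTS`, `BINS`, `column_totals`, and `category_fractions` are illustrative and not part of PANDAseq.

```python
# Counts copied from Table 1, one tuple per outcome category.
# Tuple order follows BINS: (quality 0.9 - 1.0, quality 0.6 - 0.9).
# All names here are illustrative and not taken from PANDAseq.
BINS = ("0.9 - 1.0", "0.6 - 0.9")

COUNTS = {
    "Error-free Input and Output": (544_669, 21_095),
    "All Errors Retained": (4_023, 4_675),
    "Input Errors Reduced": (50_082, 27_668),
    "Errors Introduced": (0, 37),
}


def column_totals(counts):
    """Sum each quality bin over all outcome categories."""
    return tuple(
        sum(row[i] for row in counts.values()) for i in range(len(BINS))
    )


def category_fractions(counts):
    """Return each category's share of its quality bin (values in 0..1)."""
    totals = column_totals(counts)
    return {
        name: tuple(row[i] / totals[i] for i in range(len(BINS)))
        for name, row in counts.items()
    }


if __name__ == "__main__":
    totals = column_totals(COUNTS)
    print("Totals per quality bin:", dict(zip(BINS, totals)))
    for name, shares in category_fractions(COUNTS).items():
        formatted = ", ".join(
            f"{bin_label}: {share:.4%}" for bin_label, share in zip(BINS, shares)
        )
        print(f"{name}: {formatted}")
```

The recomputed totals match the Total row of the table (598 774 and 53 475), so the four categories account for every assembled sequence in each quality bin.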

Masella et al. BMC Bioinformatics 2012 13:31   doi:10.1186/1471-2105-13-31
