Sequencing error correction without a reference genome
1 Australian Centre for Plant Functional Genomics, The University of Adelaide, Urrbrae, SA 5064, Australia
2 Phenomics and Bioinformatics Research Centre, University of South Australia, Mawson Lakes, SA 5095, Australia
3 ACRF South Australian Cancer Genome Facility, Centre for Cancer Biology, SA Pathology, Adelaide, SA 5000, Australia
4 School of Molecular and Biomedical Science, University of Adelaide, Adelaide, SA 5000, Australia
BMC Bioinformatics 2013, 14:367 doi:10.1186/1471-2105-14-367
Published: 18 December 2013
Additional file 1:
Connected subgraph of sequences. A connected subgraph of sequences of length 21 from an Illumina HiSeq data set. The most abundant sequence in this subgraph occurred 45,484 times and is represented by the largest node (filled circle).
Format: EPS Size: 157KB
Additional file 2:
A larger connected subgraph. A connected subgraph of sequences of length 21 from an Illumina GA data set. The most abundant sequence in this subgraph occurred 165,504 times in the data set. The size of each node (filled circle) is proportional to the abundance of the sequence it represents. Edges connect sequences that differ at exactly one position.
Format: EPS Size: 464KB
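
The graphs described in these additional files link sequences that differ at a single position and weight each node by abundance. The following is a minimal Python sketch of how such a graph could be built from a table of sequence counts; it is an illustration only and not the authors' implementation. The function names (`hamming1_neighbors`, `build_graph`, `connected_components`) and the toy counts are hypothetical, and the sketch assumes all sequences have the same length and use the alphabet A, C, G, T.

```python
def hamming1_neighbors(seq, alphabet="ACGT"):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for alt in alphabet:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]


def build_graph(counts):
    """Map each observed sequence to the set of observed one-mismatch neighbours."""
    graph = {seq: set() for seq in counts}
    for seq in counts:
        for neighbor in hamming1_neighbors(seq):
            if neighbor in counts:
                graph[seq].add(neighbor)
    return graph


def connected_components(graph):
    """Return the connected subgraphs as lists of sequences (iterative DFS)."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Toy abundance table (hypothetical data, length-4 sequences for brevity).
counts = {"ACGT": 5, "ACGA": 1, "TTTT": 2}
graph = build_graph(counts)
components = connected_components(graph)

# The most abundant sequence in each subgraph corresponds to the largest node.
hubs = [max(component, key=counts.get) for component in components]
```

In a plot of such a graph, each node's size would be scaled by `counts[seq]`, so the most abundant sequence in each connected subgraph appears as the largest circle.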