|
Figure 1.
Use of finishing reads. Two algorithms for assembling shotgun reads and finishing reads. The control treats
both read types equally. The bounded algorithm attempts to assemble finishing reads
consistently with their bounding constraints. For each algorithm, the figure shows
its construction of a scaffold from contigs (rectangles) with 2X coverage in shotgun reads
(black lines). Each finishing read (colored line) has a corresponding pair of PCR
primer sites (arrows of same color). External to the scaffold is a unitig (grey area)
deemed repetitive due to high coverage.
(a) A mate pair constraint (curve) localizes one read and the unitig to this gap. Nevertheless,
the control algorithm cannot tile this gap with reads. The bounded algorithm localizes
two finishing reads by their primer sites. The bounded algorithm does tile the gap
with reads, enabling a more accurate consensus sequence.
(b) The control cannot localize the unitig or any reads to this gap. It does not close
the gap. The bounded algorithm localizes the unitig by finishing reads and their primer
sites. It tiles the gap with finishing reads from the unitig.
(c) Both algorithms assemble finishing reads from a gap that is not a genomic repeat.
In our data sets, most finishing reads fit gaps of this type.
Koren et al. BMC Bioinformatics 2010 11:457 doi:10.1186/1471-2105-11-457 |