Table 1

Results using the bounded read placement algorithm.

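| Species | Candidate gaps | # Bounds | # Finishing reads | Placed finishing reads: Control | Placed finishing reads: Bounded | Placed finishing reads: Alternate | Gaps closed: Control | Gaps closed: Bounded | Gaps closed: Alternate |
|---|---|---|---|---|---|---|---|---|---|
| E. coli O157:H7 | 14 (1) | 56 | 128 | 26 | 92 | N/A | 0 | 11 | N/A |
| S. enterica | 2 (0) | 18 | 33 | 14 | 23 | N/A | 0 | 1 | N/A |
| B. mallei | 9 (0) | 23 | 40 | 4 | 27 | N/A | 0 | 4 | N/A |
| I. multifiliis | 11 (2) | 14 | 21 | 3 | 21 | 17 | 0 | 6 | 4 |
| E. coli K12 | 49 (2) | 23 | 60 | 12 | 49 | 11 | 0 | 29 | 0 |
| C. amycolatum | 4 (0) | 3 | 3 | 0 | 2 | 0 | 0 | 1 | 0 |
| Total | 89 (5) | 137 | 285 | 59 | 214 | 28 | 0 | 52 | 4 |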

Comparison of three algorithms. Control treats finishing reads like WGS reads. Bounded uses finishing reads with placement constraints. Alternate uses finishing reads in a second round of assembly without constraints.

Candidate gaps: Regions in the control assembly, lying between finishing constraints, that either have zero coverage with a consensus sequence derived from a repeat unitig or have no consensus sequence at all. The number in parentheses is the number of gaps with no consensus sequence in the control assembly. Gaps and spanning constraints do not necessarily correspond one-to-one.

Bounds: The total number of bounding constraints that span the repeat gap or were not satisfied in both the control and bounded assemblies.

Finishing reads: The total number of finishing reads generated for the bounds in the table.

Placed finishing reads: The total number of finishing reads placed in the assembly by each of the assembly algorithms.

Gaps closed: The number of gaps closed by filling in missing consensus sequence or by tiling repeat instances with reads. By definition, the control assembly always closes 0 gaps. The bounded assembly joins were verified by alignment to a finished reference, where available.
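The column definitions above can be cross-checked against the reported totals with a short script. This is a minimal sketch rather than part of the published analysis: the names `ROWS`, `REPORTED_TOTAL`, `column_totals`, and `fraction` are hypothetical, the values are transcribed from the table, and N/A entries are assumed to contribute nothing to a column sum.

```python
# Rows transcribed from Table 1. None marks an N/A entry.
# Field order: candidate gaps, gaps with no consensus (the parenthesised count),
# bounds, finishing reads, placed reads (control, bounded, alternate),
# gaps closed (control, bounded, alternate).
ROWS = {
    "E. coli O157:H7": (14, 1, 56, 128, 26, 92, None, 0, 11, None),
    "S. enterica":     (2, 0, 18, 33, 14, 23, None, 0, 1, None),
    "B. mallei":       (9, 0, 23, 40, 4, 27, None, 0, 4, None),
    "I. multifiliis":  (11, 2, 14, 21, 3, 21, 17, 0, 6, 4),
    "E. coli K12":     (49, 2, 23, 60, 12, 49, 11, 0, 29, 0),
    "C. amycolatum":   (4, 0, 3, 3, 0, 2, 0, 0, 1, 0),
}

# The "Total" row as reported in the table, in the same field order.
REPORTED_TOTAL = (89, 5, 137, 285, 59, 214, 28, 0, 52, 4)


def column_totals(rows):
    """Sum each column, skipping N/A (None) entries (an assumption of this sketch)."""
    return tuple(
        sum(value for value in column if value is not None)
        for column in zip(*rows.values())
    )


def fraction(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


totals = column_totals(ROWS)
totals_match = totals == REPORTED_TOTAL

# Derived ratios (not reported in the table itself).
closed_by_bounded = fraction(totals[8], totals[0])   # gaps closed / candidate gaps
placed_by_control = fraction(totals[4], totals[3])   # placed / generated reads
placed_by_bounded = fraction(totals[5], totals[3])

print("Column sums match reported totals:", totals_match)
print(f"Bounded: fraction of candidate gaps closed = {closed_by_bounded:.3f}")
print(f"Control: fraction of finishing reads placed = {placed_by_control:.3f}")
print(f"Bounded: fraction of finishing reads placed = {placed_by_bounded:.3f}")
```

The summed columns agree with the Total row. The ratios are derived here only for illustration and are not figures stated in the table.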

Koren et al. BMC Bioinformatics 2010 11:457   doi:10.1186/1471-2105-11-457
