Table 2

Summary of our methods for referring to 1- and 2-mismatch and 1- and 2-gap sequences at lengths 5, 21, and 10

| length | condition | #keys | #words | ratio | f(s, K) when s = c(s) | f(s, K) when s ≠ c(s) |
|---|---|---|---|---|---|---|
| 5 | 1-mismatch | 6.625 | 16 | 41.4% | 1 + 15x | 1 + 15x + 42x^2 + 54x^3 |
| 5 | 2-mismatches | 27.25 | 106 | 25.7% | 1 + 15x + 90x^2 + 210x^3 + 180x^4 | 1 + 15x + 90x^2 + 170x^3 + 156x^4 |
| 5 | 1-gap | 3.25 | 4 | 81.3% | 4 + 12x | 4 + 60x |
| 5 | 2-gaps | 10 | 16 | 62.5% | 16 + 36x + 108x^2 | – ∗1 |
| 21 | 1-mismatch | 30.53 | 64 | 47.7% | 1 + 63x | 1 + 63x + 210x^2 + 1710x^3 |
| 21 | 2-mismatches | 611.31 | 1954 | 31.3% | 1 + 63x + 1890x^2 + 4410x^3 + 34020x^4 | 1 + 63x + 1890x^2 + 5650x^3 + 31500x^4 |
| 21 | 1-gap | 3.81 | 4 | 95.3% | 4 + 60x | 4 + 252x |
| 21 | 2-gaps | 13.87 | 16 | 86.7% | 16 + 84x + 540x^2 | 16 + 48x + 960x^2 |
| 10: Serialize | 1-mismatch | 12.25 | 31 | 39.5% | 1 + 30x + 225x^2 | 1 + 30x + 170x^2 + 538x^3 + 1089x^4 + 1620x^5 ∗2 |
| 10: Parallelize | 1-mismatch | 13.25 | 31 | 44.1% | 1 + 30x | 1 + 30x + 84x^2 + 108x^3 ∗3 |


∗1: s always includes one code word.
∗2: Neither the first half nor the second half is a code word. The reference formula when one of the two halves is a code word is 1 + 30x + 267x^2 + 684x^3 + 810x^4.
∗3: Neither the first half nor the second half is a code word. The reference formula when one of the two halves is a code word is 1 + 30x + 42x^2 + 54x^3.
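
The length-10 formulas in the table and footnotes can be checked against the length-5, 1-mismatch entries. The sketch below assumes, as a reading of the numbers rather than something stated in the table, that the serialized case corresponds to the product of the two half-length polynomials and the parallelized case to their sum with the shared constant term counted once. The helper names `poly_mul` and `poly_union` are illustrative only. Under these assumptions the code reproduces both length-10 entries for s = c(s), the parallelized entry marked ∗3, and both "one half is a code word" reference formulas. It does not reproduce the serialized entry marked ∗2, so that case is left out.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_union(a, b):
    """Add two polynomials, counting the shared constant term only once.

    Assumption: this models the parallelized case; it is not taken from the source.
    """
    n = max(len(a), len(b))
    out = [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
           for i in range(n)]
    out[0] -= 1
    return out


# Length-5, 1-mismatch entries from Table 2.
CODE_WORD = [1, 15]              # 1 + 15x                   (s = c(s))
NON_CODE_WORD = [1, 15, 42, 54]  # 1 + 15x + 42x^2 + 54x^3   (s != c(s))

serial_both = poly_mul(CODE_WORD, CODE_WORD)            # both halves are code words
serial_one_half = poly_mul(CODE_WORD, NON_CODE_WORD)    # one half is a code word
parallel_both = poly_union(CODE_WORD, CODE_WORD)
parallel_one_half = poly_union(CODE_WORD, NON_CODE_WORD)
parallel_neither = poly_union(NON_CODE_WORD, NON_CODE_WORD)

print(serial_both, serial_one_half)
print(parallel_both, parallel_one_half, parallel_neither)
```

Coefficient lists are used instead of a symbolic algebra package so the check runs with the standard library alone.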

Takenaka et al. BMC Genomics 2011 12(Suppl 3):S8   doi:10.1186/1471-2164-12-S3-S8
