Table 1

Results for exact and approximate tag sequence matching

                       Reads matching with # Mismatches
Tag sequence  Library  0               1               2               3               4               5               > 5
5'-end        LIB019   38,271 (89.37)  2,253 (5.26)    564 (1.32)      185 (0.43)      50 (0.12)       64 (0.15)       1,438 (3.36)
              LIB020   14,491 (84.60)  1,629 (9.51)    430 (2.51)      165 (0.96)      31 (0.18)       24 (0.14)       359 (2.10)
              LIB021   41,764 (84.74)  4,748 (9.63)    1,345 (2.73)    427 (0.87)      125 (0.25)      111 (0.23)      762 (1.55)
3'-end        LIB019   7,194 (16.80)   12,156 (28.39)  2,454 (5.73)    688 (1.61)      683 (1.59)      766 (1.79)      18,884 (44.10)
              LIB020   2,855 (16.67)   2,460 (14.36)   561 (3.28)      279 (1.63)      275 (1.61)      904 (5.28)      9,795 (57.18)
              LIB021   7,981 (16.19)   6,924 (14.05)   1,800 (3.65)    942 (1.91)      908 (1.84)      2,480 (5.03)    28,247 (57.32)
Concatenated  LIB019   931 (2.17)      282 (0.66)      132 (0.31)      51 (0.12)       104 (0.24)      32 (0.07)       -
              LIB020   185 (1.08)      45 (0.26)       19 (0.11)       12 (0.07)       17 (0.10)       8 (0.05)        -
              LIB021   1,302 (2.64)    464 (0.94)      215 (0.44)      120 (0.24)      135 (0.27)      30 (0.06)       -

Results for the 5'-end tag sequence (5'-GTG GTG TGT TGG GTG TGT TTG GNN NNN NNN N; length: 31 bp; matching within 46 bp), the 3'-end tag sequence (NNN NNN NNN CCA AAC ACA CCC AAC ACA CCA-3'; length: 30 bp; matching within 45 bp) and the concatenated tag sequences (length: 61 bp). Note that the numbers are based on the dereplicated datasets. Percentages are shown in parentheses.
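
As an illustration of how a read could be assigned to one of the mismatch columns, the following is a minimal sketch, not the method used in the paper. It assumes that only substitutions count as mismatches (the actual tool may also account for insertions and deletions), that an N in the tag matches any base, and that the tag may start at any offset inside the stated window at the 5'-end of the read. The function names are hypothetical.

```python
# 5'-end tag from the caption, written without spaces (31 bp, N = any base).
TAG_5 = "GTGGTGTGTTGGGTGTGTTTGGNNNNNNNNN"
WINDOW_5 = 46  # "matching within 46 bp" from the caption


def mismatches(tag, segment):
    """Count substitutions between tag and segment; N in the tag matches anything."""
    return sum(1 for t, s in zip(tag, segment) if t != "N" and t != s)


def best_match_5prime(read, tag=TAG_5, window=WINDOW_5):
    """Return the fewest mismatches of the tag at any offset inside the
    first `window` bases of the read, or None if the tag does not fit."""
    region = read[:window]
    best = None
    for offset in range(len(region) - len(tag) + 1):
        d = mismatches(tag, region[offset:offset + len(tag)])
        if best is None or d < best:
            best = d
    return best


def mismatch_column(d):
    """Map a mismatch count to the column label used in the table."""
    if d is None or d > 5:
        return "> 5"
    return str(d)
```

For example, a read that starts with the fixed part of the tag followed by any nine bases would fall into column 0, and the same read with one substituted base in the fixed part would fall into column 1. Treating reads that are too short for the tag as "> 5" is a choice made for this sketch only.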

Schmieder et al. BMC Bioinformatics 2010 11:341   doi:10.1186/1471-2105-11-341
