Table 1. Real data sets

| Reference genome and source | Reference genome size | SRA accession and species | Total reads | Average read length (bp) |
|---|---|---|---|---|
| S. pneumoniae ATCC 700669 [GenBank: FM211187] | ≈2.2 Mbp | SRR001327 S. pneumoniae CDC1873-00; SRR001328 S. pneumoniae SP195; SRR001329 S. pneumoniae CDC0288-04 | 646,724 | 253 |
| E. coli O127:H6 E2348/69 [GenBank: FM180568.1] | ≈4.96 Mbp | SRR000868 E. coli K-12; SRR000870 E. coli K-12; SRR031369 E. coli ETEC WS3080A; SRR031370 E. coli ETEC TW03576 | 588,397 | 263 |
| P. falciparum 3D7 PlasmoDB rel 7.0 | ≈23.3 Mbp | SRR006911 P. falciparum 3D7; SRR006912 P. falciparum 3D7; SRR006913 P. falciparum 3D7; SRR006914 P. falciparum 3D7; SRR006915 P. falciparum 3D7 | 203,196 | 223 |
| C. elegans WormDB rel. WS210 | ≈103 Mbp | SRR022943 C. elegans Lynch MA41 mutation-accumulation line derived from N2 | 3,214,353 | 103 |
| D. pseudoobscura FlyBase rel. 2.14 | ≈150 Mbp | SRR003807 D. pseudoobscura Flagstaff 1993; SRR014458 D. pseudoobscura bogotana ER (white); SRR014459 D. pseudoobscura bogotana ER (white); SRR014460 D. miranda strain Mather 1993 | 834,659 | 239 |
| H. sapiens Chr. 15 ENSEMBL ver. GRCh37 | ≈100 Mbp | SRR014420 Human individual NA15510; SRR014421 Human individual NA15510; SRR014422 Human individual NA15510; SRR014423 Human individual NA15510; SRR014424 Human individual NA15510; SRR014425 Human individual NA15510 | 3,204 | 212 |


Biological data sets used for the evaluation of the algorithms. The read data sets were downloaded from the Sequence Read Archive (SRA) public repository.
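
As an illustration of how the "Total reads" and "Average read length" columns could be derived from a downloaded read set, the following is a minimal Python sketch. It assumes the reads have already been exported to a plain four-line-per-record FASTQ file. The function name `read_stats` and the example records are hypothetical and are not taken from the article, which does not state how the values were computed or whether they are pooled across the listed runs.

```python
import io
from typing import TextIO, Tuple


def read_stats(fastq: TextIO) -> Tuple[int, float]:
    """Return (total reads, mean read length) for a 4-line-per-record FASTQ stream."""
    total_reads = 0
    total_bases = 0
    for line_number, line in enumerate(fastq):
        # In a four-line FASTQ record the sequence is the second line (index 1).
        if line_number % 4 == 1:
            total_reads += 1
            total_bases += len(line.strip())
    # Avoid division by zero for an empty input.
    mean_length = total_bases / total_reads if total_reads else 0.0
    return total_reads, mean_length


# Tiny made-up example (hypothetical records, not real SRA data).
example = io.StringIO(
    "@read1\nACGTAC\n+\nIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
)
print(read_stats(example))  # (2, 5.0)
```

For a real file, the same function can be called on an open file handle, for example `with open(path) as handle: read_stats(handle)`, where `path` points to a FASTQ file exported from the SRA.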

Fernandes et al. BMC Bioinformatics 2011 12:163   doi:10.1186/1471-2105-12-163
