Table 4

Comparison with Genbank Annotations

| Organism | Genbank Genes with no Joins | Prodigal 1.20 | Prodigal 1.20+TiCo | Prodigal 1.20+TriTisa | GenemarkHMM 2.6 | Glimmer 3.02 | EasyGene 1.2 | MED 2.0 |
|---|---|---|---|---|---|---|---|---|
| Escherichia coli K12 | 4268 | 4118/3823 (96.5%/89.6%) | 4118/3779 (96.5%/88.5%) | 4118/3778 (96.5%/88.5%) | 4122/3685 (96.6%/86.3%) | 4076/3563 (95.5%/83.5%) | 3977/3565 (93.2%/83.5%) | 4102/3711 (96.1%/86.9%) |
| Halobacterium salinarum | 2110 | 2062/1857 (97.7%/88.0%) | 2062/1809 (97.7%/85.7%) | 2061/1790 (97.6%/84.8%) | 2042/1676 (96.7%/79.4%) | 2054/1609 (97.3%/76.2%) | 2018/1692 (95.6%/80.2%) | 2008/1469 (95.1%/69.6%) |
| Natronomonas pharaonis | 2661 | 2630/2398 (98.8%/90.1%) | 2630/2358 (98.8%/88.6%) | 2630/2348 (98.8%/88.2%) | 2624/2251 (98.6%/84.6%) | 2622/2220 (98.5%/83.4%) | 2548/2271 (95.7%/85.3%) | 2586/1953 (97.2%/73.4%) |
| Bacillus subtilis | 4174 | 4113/3705 (98.5%/88.8%) | 4113/3678 (98.5%/88.1%) | 4113/3679 (98.5%/88.1%) | 4136/3713 (99.1%/89.0%) | 4102/3569 (98.3%/85.5%) | 3977/3578 (95.3%/85.7%) | 4127/3596 (98.9%/86.2%) |
| Aeropyrum pernix | 1699 | 1670/1430 (98.3%/84.2%) | 1670/1363 (98.3%/80.2%) | 1670/1353 (98.3%/79.6%) | 1672/1364 (98.4%/80.3%) | 1671/1317 (98.4%/77.5%) | 1652/1389 (97.2%/81.8%) | 1689/1309 (99.4%/77.1%) |
| Synechocystis PCC6803 | 3171 | 3146/2587 (99.2%/81.6%) | 3146/2364 (99.2%/74.6%) | 3146/2447 (99.2%/77.2%) | 3124/2337 (98.5%/73.7%) | 3123/2236 (98.5%/70.5%) | 3053/2288 (96.3%/72.2%) | 3126/2192 (98.6%/69.1%) |
| Pseudomonas aeruginosa | 5565 | 5514/5038 (99.1%/90.5%) | 5514/4885 (99.1%/87.8%) | 5514/4821 (99.1%/86.6%) | 5484/4698 (98.5%/84.4%) | 5491/4705 (98.7%/84.5%) | 5522/4761 (99.2%/85.5%) | 5292/4539 (95.1%/81.6%) |

Table 4 shows the performance of gene-finding algorithms on seven Genbank files. In each entry, the first number is the number of genes whose 3' end was correctly identified, and the second is the number of genes whose 5' and 3' ends (the gene and its correct start) were both identified exactly. The accompanying percentages express each count as a fraction of the Genbank genes with no joins for that organism. Genbank annotations are not experimentally verified, so this table is meant only as a snapshot of performance over entire genomes.
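
As an illustration of how entries of this kind can be tallied, the following is a minimal Python sketch. It is not the authors' evaluation script: the `Gene` record and the `compare` and `as_percent` helpers are hypothetical names, and the sketch assumes each gene is described only by its 5' coordinate, 3' coordinate, and strand. The toy coordinates are invented for the example; only the E. coli counts passed to `as_percent` come from the table.

```python
from typing import Iterable, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record (an assumption, not a Genbank structure)."""
    five_prime: int   # coordinate of the translation start
    three_prime: int  # coordinate of the gene's 3' end
    strand: str       # "+" or "-"


def compare(reference: Iterable[Gene], predicted: Iterable[Gene]) -> Tuple[int, int]:
    """Return (3' matches, exact 5'+3' matches) counted over the reference genes."""
    # Index predictions by 3' end and strand so each reference gene is one lookup.
    predicted_start = {(g.three_prime, g.strand): g.five_prime for g in predicted}
    three_prime_hits = 0
    exact_hits = 0
    for gene in reference:
        start = predicted_start.get((gene.three_prime, gene.strand))
        if start is None:
            continue  # no prediction shares this 3' end
        three_prime_hits += 1
        if start == gene.five_prime:
            exact_hits += 1  # the start agrees as well
    return three_prime_hits, exact_hits


def as_percent(count: int, total: int) -> float:
    """Express a count as a percentage of the reference total, to one decimal."""
    return round(100.0 * count / total, 1)


# Toy data: one exact match, one 3' match with a different start, one miss.
reference = [Gene(100, 400, "+"), Gene(900, 500, "-"), Gene(1000, 1300, "+")]
predicted = [Gene(100, 400, "+"), Gene(870, 500, "-"), Gene(2000, 2300, "+")]
print(compare(reference, predicted))  # (2, 1)

# Prodigal 1.20 on Escherichia coli K12, using the counts from the table.
print(as_percent(4118, 4268), as_percent(3823, 4268))  # 96.5 89.6
```

Counting over the reference genes rather than over the predictions keeps both numbers bounded by the Genbank total, which is the denominator used for the percentages.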

Hyatt et al. BMC Bioinformatics 2010 11:119   doi:10.1186/1471-2105-11-119
