Table 3

Gene Prediction Performance

| Organism | %GC | Verified | Prodigal 1.20 | Prodigal 1.20+TriTisa | Prodigal 1.20+TiCo | GeneMarkHMM 2.6 | EasyGene 1.2 | Glimmer 3.02 | MED 2.0 |
|---|---|---|---|---|---|---|---|---|---|
| Escherichia coli K12 | 50.8 | 884 | 884/853 (100%/96.5%) | 884/840 (100.0%/95.0%) | 884/843 (100.0%/95.4%) | 882/835 (99.8%/94.5%) | 880/809 (99.5%/91.5%) | 880/804 (99.6%/91.0%) | 875/810 (99.0%/91.6%) |
| Halobacterium salinarum | 68.0 | 550 | 549/533 (99.8%/96.9%) | 549/525 (99.8%/95.5%) | 549/520 (99.8%/94.6%) | 548/510 (99.6%/92.7%) | 544/494 (98.9%/89.8%) | 549/478 (99.8%/86.9%) | 531/418 (96.6%/76.0%) |
| Natronomonas pharaonis | 63.4 | 321 | 320/314 (99.7%/97.8%) | 320/314 (99.7%/97.8%) | 320/313 (99.7%/97.5%) | 321/307 (100%/95.6%) | 314/300 (97.8%/93.5%) | 320/304 (99.7%/94.7%) | 315/265 (98.1%/82.6%) |
| Bacillus subtilis | 43.5 | 148 | 148/144 (100%/97.3%) | 148/145 (100.0%/98.0%) | 148/144 (100.0%/97.3%) | 147/145 (99.3%/98.0%) | 144/139 (97.3%/93.9%) | 144/140 (97.3%/94.6%) | 146/142 (98.7%/96.0%) |
| Aeropyrum pernix | 56.3 | 131 | 131/128 (100%/97.7%) | 131/127 (100.0%/97.0%) | 131/128 (100.0%/97.7%) | 130/123 (99.2%/93.9%) | 130/124 (99.2%/94.7%) | 130/121 (99.2%/92.4%) | 131/116 (100%/88.6%) |
| Synechocystis PCC6803 | 47.8 | 102 | 102/99 (100%/97.0%) | 102/98 (100%/96.1%) | 102/93 (100%/91.2%) | 102/92 (100%/90.2%) | 101/87 (99.0%/85.3%) | 102/84 (100%/82.4%) | 100/88 (98.0%/86.3%) |
| Pseudomonas aeruginosa | 66.6 | 122 | 118/116 (96.7%/95.1%) | 118/113 (96.7%/92.6%) | 118/115 (96.7%/94.3%) | 115/105 (94.3%/86.1%) | 122/112 (100%/91.8%) | 120/113 (98.4%/92.6%) | 117/113 (95.9%/92.6%) |
| Mycobacterium tuberculosis H37Rv | 65.6 | 62 | 62/58 (100%/93.6%) | 62/58 (100%/93.6%) | 62/57 (100%/91.9%) | 61/54 (98.4%/87.1%) | 62/58 (100%/93.6%) | 61/55 (98.4%/88.7%) | 60/56 (96.8%/90.3%) |
| Haemophilus influenzae | 38.2 | 67 | 67/66 (100%/98.5%) | 67/67 (100%/100%) | 67/67 (100%/100%) | 67/65 (100%/97.0%) | 67/67 (100%/100%) | 67/65 (100%/97.0%) | 66/65 (98.5%/97.0%) |
| Sulfolobus solfataricus | 35.8 | 56 | 56/51 (100%/91.1%) | 56/49 (100%/87.5%) | 56/49 (100%/87.5%) | 56/48 (100%/85.7%) | 56/51 (100%/91.1%) | 56/49 (100%/87.5%) | 56/50 (100%/89.3%) |
| All Genomes | --- | 2443 | 2437/2362 (99.8%/96.7%) | 2437/2336 (99.8%/95.6%) | 2437/2329 (99.8%/95.3%) | 2429/2284 (99.4%/93.5%) | 2420/2241 (99.1%/91.7%) | 2429/2213 (99.4%/90.6%) | 2397/2123 (98.1%/86.9%) |


Table 3 shows the performance of gene-finding algorithms on ten sets of experimentally verified genes with experimentally verified translation initiation sites. In each entry, the first number is the count of genes whose 3' end was correctly identified, and the second is the count of genes whose 5' and 3' ends (the gene and its correct start) were both identified exactly. The percentages in parentheses express these two counts relative to the number of verified genes for that organism. The final row shows the performance over the entire set of organisms.
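The two counts in each entry can be illustrated with a minimal Python sketch. It is not the evaluation code used for the table: the `Gene` record, the `score` and `as_percent` helpers, and the example coordinates are all hypothetical, and the sketch assumes a single sequence in which a gene is identified by its strand and its 5' and 3' coordinates.

```python
from typing import Iterable, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record; field names are assumptions for illustration."""
    strand: str       # '+' or '-'
    five_prime: int   # coordinate of the translation initiation site
    three_prime: int  # coordinate of the 3' end (stop codon)


def score(verified: Iterable[Gene], predicted: Iterable[Gene]) -> Tuple[int, int]:
    """Return (3' ends matched, 5'+3' ends matched exactly)."""
    # Index the predictions by strand and 3' end, the key that defines
    # "the gene was found" in the first number of each table entry.
    by_three_prime = {(g.strand, g.three_prime): g for g in predicted}
    n_three = 0
    n_both = 0
    for gene in verified:
        hit = by_three_prime.get((gene.strand, gene.three_prime))
        if hit is None:
            continue  # gene missed entirely
        n_three += 1
        if hit.five_prime == gene.five_prime:
            n_both += 1  # start site also matches exactly
    return n_three, n_both


def as_percent(count: int, total: int) -> float:
    """Express a count as a percentage of the verified genes, one decimal."""
    return round(100.0 * count / total, 1)


# Made-up coordinates: one exact match, one wrong start, one missed gene.
verified = [Gene('+', 100, 400), Gene('-', 900, 500), Gene('+', 1000, 1300)]
predicted = [Gene('+', 100, 400), Gene('-', 870, 500), Gene('+', 2000, 2300)]
matched = score(verified, predicted)

# Percentages for the Prodigal 1.20 entry on E. coli K12 (884/853 of 884).
ecoli = (as_percent(884, 884), as_percent(853, 884))
```

With the made-up coordinates, `matched` is `(2, 1)`: two genes are found by their 3' end, and only one of them also has the correct start. Applying `as_percent` to the counts in the table gives the parenthesized values, for example 853 of 884 verified E. coli genes is 96.5%.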

Hyatt et al. BMC Bioinformatics 2010 11:119   doi:10.1186/1471-2105-11-119
