Table 4

Benchmarking FOSTA against the refined Hulsen et al. dataset

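| | | | Basic statistics | | | | Evaluation statistics | |
|---|---|---|---|---|---|---|---|---|
| Protein family | Refined | (TO) | TP | FP | TN | FN | PPV (%) | MCC |
| HBB | 2 | (9) | 2 | 0 | 17 | 0 | 100.00 | 1.00 |
| HOX | 30 | (41) | 30 | 0 | 3853 | 0 | 100.00 | 1.00 |
| SMm | 12 | (17) | 12 | 0 | 22 | 0 | 100.00 | 1.00 |
| SMc | 6 | (6) | 6 | 0 | 5 | 0 | 100.00 | 1.00 |
| NR | 4 | (29) | 1 | 1 | 327 | 3 | 50.00 | 0.35 |
| All | 54 | (102) | 51 | 1 | 4224 | 3 | 98.08 | 0.96 |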

Protein family: the protein family examined. Refined: the number of one-to-one TO pairings tested after refinement of the Hulsen TO dataset. (TO): the number of TO pairs in the Hulsen dataset (including many-to-many orthologous pairings and non-UniProtKB/Swiss-Prot proteins). Basic statistics: the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Evaluation statistics: the PPV (positive predictive value, TP/(TP + FP), expressed as a percentage) and the MCC (Matthews correlation coefficient), both rounded to 2 decimal places.
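The evaluation statistics can be recomputed from the basic counts. The following is a minimal sketch, not the authors' code: the function names `ppv` and `mcc` and the `counts` dictionary are illustrative assumptions, and the MCC uses the standard definition (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)), which the legend does not spell out.

```python
from math import sqrt


def ppv(tp, fp):
    """Positive predictive value as a percentage: 100 * TP / (TP + FP)."""
    return 100.0 * tp / (tp + fp)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient (standard definition).

    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# (TP, FP, TN, FN) per protein family, copied from the table above.
counts = {
    "HBB": (2, 0, 17, 0),
    "HOX": (30, 0, 3853, 0),
    "SMm": (12, 0, 22, 0),
    "SMc": (6, 0, 5, 0),
    "NR": (1, 1, 327, 3),
}

# The "All" row is the column-wise sum over the five families.
counts["All"] = tuple(sum(col) for col in zip(*counts.values()))

for family, (tp, fp, tn, fn) in counts.items():
    print(f"{family}: PPV = {ppv(tp, fp):.2f}, MCC = {mcc(tp, fp, tn, fn):.2f}")
```

Under these assumptions the recomputed values agree with the PPV and MCC columns of the table, and the summed counts reproduce the All row.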

McMillan and Martin BMC Bioinformatics 2008 9:418   doi:10.1186/1471-2105-9-418