Table 9. Runs on the test set (after code correction)

| Run | Precision | Recall | F-Score | MCC | AUC iP/R | Total Docs Evaluated |
| --- | --- | --- | --- | --- | --- | --- |
| All | 2.50% | 93.17% | 0.0487 | 0.0908 | 0.1852 | 222 |
| Top 40 | 4.83% | 82.92% | 0.0913 | 0.1604 | 0.1583 | 222 |
| RScore ≥6 | 26.61% | 50.58% | 0.3488 | 0.3535 | 0.1522 | 214 |
| RScore ≥7 | 28.44% | 48.62% | 0.3589 | 0.3591 | 0.1524 | 210 |

The table shows the results of running our corrected program on the BC 3 test set. The measures shown are precision, recall, F-score, Matthews Correlation Coefficient (MCC), area under the interpolated precision/recall curve (AUC iP/R), and the total number of articles evaluated by our program. The rows reflect four different runs: the first is based on pattern-matching of methods to the text alone (All); the second scores the sentence-method associations and reports the top 40 scoring methods (Top 40); the third reports the top-scoring methods whose raw score was at least 6 (RScore ≥6); and the last reports the top-scoring methods whose raw score was at least 7 (RScore ≥7).
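
As a reading aid, the relationship between the precision, recall, and F-score columns can be sketched in a few lines of Python. This is not the authors' evaluation code. It assumes the F-score is the balanced F1 (the harmonic mean of precision and recall) and that MCC follows the standard confusion-matrix definition. The names `f_score`, `mcc`, `TABLE_9`, and `check_table` are hypothetical and introduced only for illustration. Because the table does not report confusion-matrix counts, only the F-score column can be recomputed from it.

```python
import math


def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom


# (precision, recall, reported F-score) copied from the table rows,
# with percentages converted to fractions.
TABLE_9 = {
    "All": (0.0250, 0.9317, 0.0487),
    "Top 40": (0.0483, 0.8292, 0.0913),
    "RScore >=6": (0.2661, 0.5058, 0.3488),
    "RScore >=7": (0.2844, 0.4862, 0.3589),
}


def check_table(rows, tolerance=1e-3):
    """Return, per run, whether the recomputed F1 is within tolerance
    of the reported F-score (the tolerance absorbs rounding)."""
    return {
        run: abs(f_score(p, r) - reported) <= tolerance
        for run, (p, r, reported) in rows.items()
    }


if __name__ == "__main__":
    for run, (p, r, reported) in TABLE_9.items():
        print(f"{run}: recomputed {f_score(p, r):.4f}, reported {reported:.4f}")
```

Under these assumptions the recomputed values agree with the reported F-scores to about three decimal places; the small residual differences are consistent with rounding of the published precision and recall.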
Lourenço et al. BMC Bioinformatics 2011 12(Suppl 8):S12 doi:10.1186/1471-2105-12-S8-S12 |