Table 7. Dictionary lookup performance. Speed and accuracy of dictionary lookup using the human gene/protein dictionary and gene/protein name snippets. F-score is the harmonic mean of precision and recall. Values in parentheses are the thresholds used in soft string matching.

| Method | Precision | Recall | F-score | Average lookup time (microseconds) |
|---|---|---|---|---|
| Bigram similarity (0.97) | 0.758 | 0.587 | 0.661 | 6.7 × 10⁵ |
| Bigram similarity (0.95) | 0.691 | 0.592 | 0.638 | 6.8 × 10⁵ |
| Bigram similarity (0.93) | 0.612 | 0.610 | 0.611 | 6.8 × 10⁵ |
| No normalization | 0.809 | 0.502 | 0.619 | 7 |
| Case normalization | 0.782 | 0.582 | 0.666 | 8 |
| Heuristic normalization [18] | 0.730 | 0.657 | 0.692 | 8 |
| Automatic normalization | 0.767 | 0.633 | 0.694 | 29 |

Tsuruoka et al. BMC Bioinformatics 2008 9(Suppl 3):S2 doi:10.1186/1471-2105-9-S3-S2
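
Since the caption defines F-score as the harmonic mean of precision and recall, the F-score column can be recomputed from the other two columns. The following is a minimal sketch of that check; the function and variable names (`f_score`, `TABLE_7`, `recompute`) are illustrative and do not come from the paper. Small last-digit differences from the reported values can arise, for instance because the published precision and recall are themselves rounded.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, precision, recall, reported F-score), copied from Table 7.
TABLE_7 = [
    ("Bigram similarity (0.97)", 0.758, 0.587, 0.661),
    ("Bigram similarity (0.95)", 0.691, 0.592, 0.638),
    ("Bigram similarity (0.93)", 0.612, 0.610, 0.611),
    ("No normalization", 0.809, 0.502, 0.619),
    ("Case normalization", 0.782, 0.582, 0.666),
    ("Heuristic normalization", 0.730, 0.657, 0.692),
    ("Automatic normalization", 0.767, 0.633, 0.694),
]


def recompute(rows=TABLE_7):
    """Return (method, recomputed F-score rounded to 3 places, reported F-score)."""
    return [(m, round(f_score(p, r), 3), f) for m, p, r, f in rows]


if __name__ == "__main__":
    for method, computed, reported in recompute():
        print(f"{method:28s} computed={computed:.3f} reported={reported:.3f}")
```

Running the script prints the recomputed and reported F-scores side by side, which makes any transcription error in a row easy to spot.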