Table 11

Effect of the trained POS tagger on biomedical literature. POS(A) means the original POS tagger trained on newswire text. POS(B) means the POS tagger trained on GENIA corpus 3.02p. "full(A)" means using all features with the original, newswire-trained POS tagger. "full(B)" means using all features with the POS tagger trained on GENIA corpus 3.02p. The parenthesized values are p-values. We compare two pairs: "word+POS(A)+pc." vs. "word+POS(B)+pc." and "full(A)" vs. "full(B)". Values in bold differ significantly within their comparison. A difference is labeled statistically significant when the p-value is less than 0.05 on the Wilcoxon signed-ranks sum test (two-sided).

                    word+POS(A)+pc.   word+POS(B)+pc.   full(A)   full(B)

Precision           0.7813            0.7867            0.8189    0.8177
                                      (0.105)                     (0.557)

Recall              0.6423            0.6147            0.7661    0.7640
                                      (0.002)                     (0.322)

Balanced f-score    0.7118            0.6900            0.7916    0.7899
                                      (0.002)                     (0.432)
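
The significance criterion in the caption (two-sided p-value below 0.05) can be illustrated with a small sketch. It assumes the test meant is the paired Wilcoxon signed-rank test applied to per-fold scores of the two systems being compared; the table does not report per-fold scores, so `scores_a` and `scores_b` below are hypothetical values invented for illustration, and the function name `wilcoxon_signed_rank` is likewise not from the paper. The p-value is computed exactly by enumerating all sign assignments, which is practical only for a small number of pairs.

```python
from itertools import product


def wilcoxon_signed_rank(a, b):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences receive average ranks.
    """
    # Round to suppress floating-point noise when detecting zeros and ties.
    diffs = [round(x - y, 12) for x, y in zip(a, b)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Test statistic: sum of the ranks of the positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    center = sum(ranks) / 2
    observed = abs(w_plus - center)

    # Under the null hypothesis every rank is positive or negative with
    # probability 1/2, so enumerate all 2^n sign patterns.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Hypothetical per-fold f-scores for two systems (NOT data from the paper).
scores_a = [0.72, 0.70, 0.71, 0.73, 0.69, 0.74, 0.70, 0.72, 0.71, 0.73]
scores_b = [0.70, 0.69, 0.68, 0.71, 0.68, 0.70, 0.69, 0.69, 0.70, 0.70]

w_plus, p_value = wilcoxon_signed_rank(scores_a, scores_b)
significant = p_value < 0.05  # the threshold stated in the caption
print(w_plus, p_value, significant)
```

In this made-up example system A beats system B on all ten folds, so only the two one-sided sign patterns are as extreme as the observed one and the exact p-value is 2/1024, below the 0.05 threshold.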


Mitsumori et al. BMC Bioinformatics 2005 6(Suppl 1):S8   doi:10.1186/1471-2105-6-S1-S8
