Table 1

Results from the WSD system. Results from the WSD system applied to various sections of the NLM-WSD data set using a variety of features and machine learning algorithms. The best results obtained by our system are highlighted in bold font. Results from baseline and previously published approaches are included for comparison.

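Features (columns) by data set (rows), grouped by learning algorithm

| Algorithm | Data sets | Linguistic | CUI | MeSH | CUI+MeSH | Linguistic+MeSH | Linguistic+CUI | Linguistic+MeSH+CUI |
|---|---|---|---|---|---|---|---|---|
| Vector space model | All words | 87.0 | 85.8 | 81.9 | 86.9 | 87.9 | 87.3 | 87.5 |
| Vector space model | Joshi subset | 82.1 | 79.6 | 76.6 | 81.4 | 83.3 | 82.4 | 82.8 |
| Vector space model | Leroy subset | 77.5 | 74.4 | 70.4 | 75.8 | 79.7 | 78.7 | 78.9 |
| Vector space model | Liu subset | 84.0 | 81.3 | 78.3 | 83.4 | 84.8 | 83.9 | 84.2 |
| Vector space model | Common subset | 79.1 | 75.1 | 70.4 | 76.9 | 81.1 | 80.0 | 79.7 |
| Naive Bayes | All words | 86.4 | 81.2 | 85.7 | 81.1 | 86.4 | 81.7 | 81.8 |
| Naive Bayes | Joshi subset | 80.9 | 73.4 | 80.1 | 73.7 | 81.1 | 74.1 | 74.5 |
| Naive Bayes | Leroy subset | 76.9 | 66.1 | 74.6 | 65.9 | 77.5 | 66.5 | 67.2 |
| Naive Bayes | Liu subset | 82.1 | 75.4 | 81.7 | 75.3 | 82.7 | 76.3 | 76.6 |
| Naive Bayes | Common subset | 77.2 | 66.1 | 74.7 | 65.8 | 79.0 | 66.7 | 67.4 |
| Support Vector Machine | All words | 85.9 | 83.5 | 85.3 | 84.5 | 86.2 | 85.3 | 86.0 |
| Support Vector Machine | Joshi subset | 80.1 | 76.4 | 79.5 | 78.0 | 80.9 | 79.1 | 80.3 |
| Support Vector Machine | Leroy subset | 75.5 | 69.7 | 72.6 | 72.0 | 77.1 | 74.5 | 76.3 |
| Support Vector Machine | Liu subset | 81.7 | 78.2 | 81.0 | 80.0 | 82.3 | 80.6 | 81.7 |
| Support Vector Machine | Common subset | 76.3 | 69.8 | 71.6 | 73.0 | 78.1 | 75.1 | 76.9 |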


Previous Approaches

| Data sets | MFS baseline | Per-term: Liu et al. (2004) | Per-term: Joshi et al. (2005) | Per-term: Leroy and Rindflesch (2005) | Global: Joshi et al. (2005) | Global: McInnes et al. (2007) |
|---|---|---|---|---|---|---|
| All words | 78.0 | - | - | - | 86.2 | 85.3 |
| Joshi subset | 66.9 | - | 82.5 | - | 80.9 | 80.0 |
| Leroy subset | 55.3 | - | 77.4 | 65.5 | 75.7 | 74.5 |
| Liu subset | 69.9 | 78.0 | 84.9 | - | 83.3 | 81.9 |
| Common subset | 54.9 | - | 79.8 | 68.8 | 78.1 | 75.6 |


Stevenson et al. BMC Bioinformatics 2008 9(Suppl 11):S7   doi:10.1186/1471-2105-9-S11-S7
