Table 1. Features extracted.

Feature                        Value
word                           all words in the training data
orthography                    capital, symbol, etc. (see Table 2)
prefix                         1, 2, or 3 gram of the starting letters of a word
suffix                         1, 2, or 3 gram of the ending letters of a word
part of speech                 Brill tagger
preceding class                -2, -1
gene/protein name dictionary   protein names collected from SWISS-PROT and TrEMBL
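As an illustration of how the word, prefix, suffix, and preceding class features listed above might be assembled for one token, a minimal Python sketch follows. It is not the authors' implementation. The function names, the feature keys, the "BOS" placeholder for positions before the start of a sentence, the class labels used in the example, and the choice to skip n-grams longer than the word are all assumptions made for illustration. The orthography, part of speech, and dictionary features are omitted because they depend on Table 2 and on external resources.

```python
def affix_ngrams(word, max_n=3):
    """Return prefix and suffix character n-grams (n = 1..max_n) of a word.

    Assumption: n-grams longer than the word itself are skipped.
    """
    sizes = [n for n in range(1, max_n + 1) if n <= len(word)]
    prefixes = [word[:n] for n in sizes]
    suffixes = [word[-n:] for n in sizes]
    return prefixes, suffixes


def token_features(words, classes, i):
    """Build a feature dict for the token at position i.

    `classes` holds the class labels already assigned to the tokens
    before position i. Positions before the start of the sentence get
    the hypothetical placeholder "BOS".
    """
    prefixes, suffixes = affix_ngrams(words[i])
    features = {"word": words[i]}
    for p in prefixes:
        features[f"prefix{len(p)}"] = p
    for s in suffixes:
        features[f"suffix{len(s)}"] = s
    # Preceding class feature at offsets -2 and -1, as in Table 1.
    for offset in (-2, -1):
        j = i + offset
        features[f"class[{offset}]"] = classes[j] if j >= 0 else "BOS"
    return features


if __name__ == "__main__":
    # Hypothetical sentence and labels, for illustration only.
    words = ["the", "IL-2", "gene"]
    classes = ["O"]  # label already assigned to "the"
    print(token_features(words, classes, 1))
```

In a sequence labeller, a dict of this kind would typically be converted to a sparse binary vector, one dimension per distinct key and value pair observed in the training data.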


Mitsumori et al. BMC Bioinformatics 2005 6(Suppl 1):S8   doi:10.1186/1471-2105-6-S1-S8
