Table 3

Rules used for the post-expansion step. Each rule rewrites certain part-of-speech tags adjacent to a NEWGENE tag as NEWGENE. Nouns are expanded unless they appear on an exclusion list (372 particular nouns when the noun follows NEWGENE, 222 when it precedes NEWGENE); adjectives are expanded only if they are among 778 particular adjectives. NN*: nouns, proper nouns, plurals; JJ: adjective; CD: cardinal number; DT: determiner; '/': the slash token itself, not a tag.

| Former POS pattern | Expanded pattern | Limitation |
|---|---|---|
| NEWGENE NN* | NEWGENE NEWGENE | all but 372 particular nouns |
| NN* NEWGENE | NEWGENE NEWGENE | all but 222 particular nouns |
| JJ NEWGENE | NEWGENE NEWGENE | only 778 particular adjectives |
| NEWGENE JJ | NEWGENE NEWGENE | only 778 particular adjectives |
| NEWGENE DT NN* | NEWGENE NEWGENE NEWGENE | |
| NEWGENE CD | NEWGENE NEWGENE | |
| NN* / NEWGENE | NEWGENE NEWGENE NEWGENE | |
| NEWGENE / NN* | NEWGENE NEWGENE NEWGENE | |
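
The rules in this table could be applied roughly as in the following Python sketch. It is an illustration, not the authors' implementation. The function name `expand_newgene`, the parameters `excluded_after`, `excluded_before`, and `allowed_adjectives`, and the choice to match every rule once against the original tag sequence (rather than iterating until nothing changes) are all assumptions. The actual lists of 372 nouns, 222 nouns, and 778 adjectives are not given here, so the caller has to supply them.

```python
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple

NEWGENE = "NEWGENE"


def symbol_matches(symbol: str, token: str, tag: str) -> bool:
    """Check one pattern symbol against one (token, tag) pair."""
    if symbol == "NN*":   # any noun tag, e.g. NN, NNS, NNP
        return tag.startswith("NN")
    if symbol == "/":     # the literal slash token, whatever its tag
        return token == "/"
    return tag == symbol  # NEWGENE, JJ, CD, DT


def expand_newgene(
    tokens: Sequence[str],
    tags: Sequence[str],
    excluded_after: FrozenSet[str] = frozenset(),      # hypothetical stand-in for the 372 nouns
    excluded_before: FrozenSet[str] = frozenset(),     # hypothetical stand-in for the 222 nouns
    allowed_adjectives: FrozenSet[str] = frozenset(),  # hypothetical stand-in for the 778 adjectives
) -> List[str]:
    """Return a copy of `tags` with the table's patterns rewritten to NEWGENE.

    Assumption: patterns are matched against the original tags in one pass,
    so a tag that becomes NEWGENE does not trigger further expansions.
    """
    Rule = Tuple[Tuple[str, ...], Optional[Callable[[Sequence[str]], bool]]]
    rules: List[Rule] = [
        (("NEWGENE", "NN*"), lambda w: w[1] not in excluded_after),
        (("NN*", "NEWGENE"), lambda w: w[0] not in excluded_before),
        (("JJ", "NEWGENE"), lambda w: w[0] in allowed_adjectives),
        (("NEWGENE", "JJ"), lambda w: w[1] in allowed_adjectives),
        (("NEWGENE", "DT", "NN*"), None),
        (("NEWGENE", "CD"), None),
        (("NN*", "/", "NEWGENE"), None),
        (("NEWGENE", "/", "NN*"), None),
    ]
    expanded = list(tags)
    for pattern, limitation in rules:
        width = len(pattern)
        for start in range(len(tags) - width + 1):
            window_tokens = tokens[start:start + width]
            window_tags = tags[start:start + width]
            pattern_ok = all(
                symbol_matches(symbol, token, tag)
                for symbol, token, tag in zip(pattern, window_tokens, window_tags)
            )
            if pattern_ok and (limitation is None or limitation(window_tokens)):
                expanded[start:start + width] = [NEWGENE] * width
    return expanded
```

With an illustrative adjective list containing only "wild-type", the tags JJ NEWGENE NN for the tokens "wild-type p53 protein" all become NEWGENE. If "expression" is placed on the exclusion list for following nouns, the tags NEWGENE NN for "p53 expression" stay unchanged. Both word lists in this example are made up for demonstration.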


Hakenberg et al. BMC Bioinformatics 2005 6(Suppl 1):S9   doi:10.1186/1471-2105-6-S1-S9