Table 3

Rules used for the post-expansion step. Each rule switches certain part-of-speech tags to NEWGENE tags. Nouns are expanded unless they appear on a list of excluded nouns (372 nouns when the noun follows a NEWGENE token, 222 when it precedes one), and adjectives are expanded only if they appear on a list of 778 particular adjectives. NN*: nouns, proper nouns, plurals; JJ: adjective; CD: cardinal number; DT: determiner; '/' refers to the literal slash token.

| Former POS pattern | Expanded pattern        | Limitation                     |
|--------------------|-------------------------|--------------------------------|
| NEWGENE NN*        | NEWGENE NEWGENE         | all but 372 particular nouns   |
| NN* NEWGENE        | NEWGENE NEWGENE         | all but 222 particular nouns   |
| JJ NEWGENE         | NEWGENE NEWGENE         | only 778 particular adjectives |
| NEWGENE JJ         | NEWGENE NEWGENE         | only 778 particular adjectives |
| NEWGENE DT NN*     | NEWGENE NEWGENE NEWGENE |                                |
| NEWGENE CD         | NEWGENE NEWGENE         |                                |
| NN* / NEWGENE      | NEWGENE NEWGENE NEWGENE |                                |
| NEWGENE / NN*      | NEWGENE NEWGENE NEWGENE |                                |
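
The rules in this table can be read as small pattern rewrites over a sequence of (token, tag) pairs. The following Python sketch illustrates one possible way to apply them; it is not the authors' implementation. Several details are assumptions made for illustration only: the names `build_rules` and `expand` are hypothetical, the word lists are tiny placeholders (the actual lists of 372 and 222 nouns and 778 adjectives are not given in this table), word-list lookups are case-insensitive, rows without a stated limitation are applied without restriction, and the rules are re-applied until no tag changes.

```python
from typing import Callable, List, Set, Tuple

Tagged = Tuple[str, str]             # (token, POS tag)
Matcher = Callable[[str, str], bool]  # decides whether one position matches
Rule = Tuple[Matcher, ...]            # one matcher per position in the pattern

GENE = "NEWGENE"


def build_rules(nouns_after: Set[str], nouns_before: Set[str],
                adjectives: Set[str]) -> List[Rule]:
    """Encode the eight table rows as tuples of per-position matchers.

    nouns_after / nouns_before stand in for the excluded-noun lists and
    adjectives for the allowed-adjective list (placeholders, assumed lowercase).
    """
    def gene(tok: str, tag: str) -> bool:
        return tag == GENE

    def noun(tok: str, tag: str) -> bool:
        return tag.startswith("NN")  # NN, NNS, NNP, NNPS

    def noun_except(excluded: Set[str]) -> Matcher:
        return lambda tok, tag: tag.startswith("NN") and tok.lower() not in excluded

    def adj(tok: str, tag: str) -> bool:
        return tag == "JJ" and tok.lower() in adjectives

    def tag_is(wanted: str) -> Matcher:
        return lambda tok, tag: tag == wanted

    def slash(tok: str, tag: str) -> bool:
        return tok == "/"  # '/' refers to the token itself, whatever its tag

    return [
        (gene, noun_except(nouns_after)),   # NEWGENE NN*
        (noun_except(nouns_before), gene),  # NN* NEWGENE
        (adj, gene),                        # JJ NEWGENE
        (gene, adj),                        # NEWGENE JJ
        (gene, tag_is("DT"), noun),         # NEWGENE DT NN*
        (gene, tag_is("CD")),               # NEWGENE CD
        (noun, slash, gene),                # NN* / NEWGENE
        (gene, slash, noun),                # NEWGENE / NN*
    ]


def expand(tagged: List[Tagged], rules: List[Rule]) -> List[Tagged]:
    """Retag every matched window as NEWGENE, repeating until stable."""
    tokens = [tok for tok, _ in tagged]
    tags = [tag for _, tag in tagged]
    changed = True
    while changed:  # terminates: every match turns at least one tag into NEWGENE
        changed = False
        for rule in rules:
            n = len(rule)
            for i in range(len(tokens) - n + 1):
                if all(m(tokens[i + k], tags[i + k]) for k, m in enumerate(rule)):
                    for k in range(n):
                        tags[i + k] = GENE
                    changed = True
    return list(zip(tokens, tags))


if __name__ == "__main__":
    rules = build_rules({"expression"}, {"gene"}, {"mutant"})
    sentence = [("mutant", "JJ"), ("p53", GENE), ("protein", "NN"),
                ("binds", "VBZ"), ("the", "DT"), ("receptor", "NN")]
    for token, tag in expand(sentence, rules):
        print(token, tag)
```

With the placeholder lists above, "mutant" and "protein" are absorbed into the NEWGENE span around "p53", while "binds the receptor" keeps its original tags because no rule bridges the verb.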


Hakenberg et al. BMC Bioinformatics 2005 6(Suppl 1):S9   doi:10.1186/1471-2105-6-S1-S9
