Table 3. Feature sets for learning.

| Feature set | Size | Description |
|---|---|---|
| tokN | 8N | Surface string and POS of tokens surrounding the arguments, windowed -N to +N, N = 6 by default |
| gentokN | 8N | Root and generalised POS of tokens surrounding the argument entities, windowed -N to +N, N = 6 by default |
| atype | 1 | Concatenated semantic type of arguments, in arg1-arg2 order |
| dir | 1 | Direction: linear text order of the arguments (is arg1 before arg2, or vice versa?) |
| dist | 2 | Distance: absolute number of sentence and paragraph boundaries between arguments |
| str | 14 | Surface string features based on Zhou et al. [29]; see text for full description |
| pos | 14 | POS features, as above |
| root | 14 | Root features, as above |
| genpos | 14 | Generalised POS features, as above |
| inter | 11 | Intervening mentions: numbers and types of intervening entity mentions between arguments |
| event | 5 | Events: are any of the arguments, or intervening entities, events? |
| allgen | 96 | All above features in root and generalised POS forms, i.e. gentok6+atype+dir+dist+root+genpos+inter+event |
| notok | 48 | All above except tokN features, others in string and POS forms, i.e. atype+dir+dist+str+pos+inter+event |
| dep | 16 | Features based on a syntactic dependency path |
| syndist | 2 | The distance between the two arguments, along a token path and along a syntactic dependency path |

Feature sets used for learning relationships. The table is split into non-syntactic features (tokN through event), combined non-syntactic features (allgen and notok), and syntactic features (dep and syndist). The size of a set is the number of features in that set.
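The sizes of the two combined sets follow by adding up the sizes of their component sets. The short Python sketch below checks that arithmetic. It is illustrative only and not part of the authors' system: the dictionary `SIZES` and the helper `combined_size` are hypothetical names, and the values are copied from the table with N = 6, the stated default.

```python
# Sizes of the individual feature sets, copied from Table 3.
# tokN and gentokN contain 8N features; N = 6 is the stated default.
N = 6
SIZES = {
    "tok6": 8 * N,
    "gentok6": 8 * N,
    "atype": 1,
    "dir": 1,
    "dist": 2,
    "str": 14,
    "pos": 14,
    "root": 14,
    "genpos": 14,
    "inter": 11,
    "event": 5,
}


def combined_size(spec: str) -> int:
    """Sum the sizes of the '+'-separated feature sets named in spec."""
    return sum(SIZES[name] for name in spec.split("+"))


# Component lists as given in the allgen and notok rows of the table.
allgen = combined_size("gentok6+atype+dir+dist+root+genpos+inter+event")
notok = combined_size("atype+dir+dist+str+pos+inter+event")

print(allgen, notok)  # prints "96 48", matching the Size column
```

Both totals agree with the table: allgen is 48 + 1 + 1 + 2 + 14 + 14 + 11 + 5 = 96, and notok is 1 + 1 + 2 + 14 + 14 + 11 + 5 = 48.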
Roberts et al. BMC Bioinformatics 2008 9(Suppl 11):S3 doi:10.1186/1471-2105-9-S11-S3 |