Table 15

Surface and parsing features generated from sentence text and used for training non-kernel-based classifiers
Feature type  Feature                              Example
surface       distance (word/char)                 sentence length in characters
surface       distance (word/char)                 entity distance in words
surface       count                                number of proteins in sentence
surface       negation clues (s/b/w/a)             negation word before entities
surface       hedge clues (s/b/w/a)                hedge word after entities
surface       enumeration clues (b)                comma between entities
surface       interaction word clues (s/b/w/a)     interaction word in sentence
surface       entity modifier (a)                  -ing word after first entity
parsing       distance (graph)                     length of syntax tree shortest path
parsing       occurrence features (entire graph)   number of conj constituents in the syntax tree
parsing       occurrence features (shortest path)  number of conj constituents along the shortest path in the syntax tree
parsing       frequency features (entire graph)    relative frequency of conj labels over the dependency graph
parsing       frequency features (shortest path)   relative frequency of conj labels over the shortest path relations
parsing       entropy                              Kullback–Leibler divergence of constituent types in the entire syntax tree

Features may refer to both sentence-level and pair-level characteristics. Parsing features were generated from both syntax and dependency parses. The scope of a feature is typically one of the following: the whole sentence (s), before the entities (b), between the entities (w), or after the entities (a).
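
To make the scope notation concrete, the following minimal Python sketch computes a few of the surface features for a single entity pair. It is an illustration under stated assumptions, not the authors' implementation: the function name `surface_features`, the whitespace tokenization, and the small `NEGATION_WORDS` lexicon are hypothetical, since the actual clue lists and preprocessing are not given in this table.

```python
from typing import Dict, List, Set

# Hypothetical negation lexicon; the clue lists actually used are not given here.
NEGATION_WORDS: Set[str] = {"not", "no", "neither", "nor", "without"}


def surface_features(tokens: List[str], e1: int, e2: int,
                     protein_idx: Set[int]) -> Dict[str, int]:
    """Compute a few surface features for the entity pair at token indices e1 < e2.

    tokens      -- the tokenized sentence (assumed whitespace tokenization)
    protein_idx -- token indices of all protein mentions in the sentence
    """
    if not 0 <= e1 < e2 < len(tokens):
        raise ValueError("expected 0 <= e1 < e2 < len(tokens)")

    # Token spans for the four scopes named in the table legend.
    scopes = {
        "s": tokens,               # whole sentence
        "b": tokens[:e1],          # before the first entity
        "w": tokens[e1 + 1:e2],    # between the two entities
        "a": tokens[e2 + 1:],      # after the second entity
    }

    feats = {
        # Length of the space-joined tokens, an approximation of the raw sentence length.
        "sentence_length_chars": len(" ".join(tokens)),
        "entity_distance_words": e2 - e1 - 1,
        "protein_count": len(protein_idx),
        "comma_between_entities": int("," in scopes["w"]),
    }
    # One binary negation-clue feature per scope (s/b/w/a).
    for scope, words in scopes.items():
        feats[f"negation_{scope}"] = int(
            any(w.lower() in NEGATION_WORDS for w in words)
        )
    return feats


tokens = "ProtA does not bind ProtB , but activates ProtC".split()
pair_1 = surface_features(tokens, 0, 4, {0, 4, 8})  # ProtA / ProtB
pair_2 = surface_features(tokens, 4, 8, {0, 4, 8})  # ProtB / ProtC
```

In this toy sentence the negation word falls between the entities for the first pair and before the entities for the second pair, so the same sentence-level clue yields different pair-level features depending on scope.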


Tikk et al. BMC Bioinformatics 2013 14:12   doi:10.1186/1471-2105-14-12
