Table 3

Devel and test results for the BioNLP'11 Shared Task

Corpus          Devel F    Test F

GE'09 task 1    56.27      53.15
GE'09 task 2    54.25      50.68

GE task 1       55.78      53.30
GE task 2       53.39      51.97
GE task 3       38.34      26.86
EPI             56.41      53.33
ID              44.92      42.57
BB              27.01      26
BI              77.24      77
CO              36.22      23.77
REL             65.99      57.7
REN             84.62      87.0

The performance of our new system on the BioNLP'09 ST GENIA dataset is shown for reference, with task 3 omitted due to a changed metric. For the GE tasks, the Approximate Span & Recursive matching criterion is used. In many tasks, the development and test set results differ considerably. This may be partially explained by noise that went unnoticed because no cross-validation was used, and by the event distribution not being stratified across the sets.
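To make the size of the development/test differences concrete, the following minimal sketch computes the gap (devel F minus test F) for each row of Table 3. The scores are copied from the table; the names `SCORES` and `devel_test_gaps` are illustrative and not part of the original system.

```python
# F-scores copied from Table 3 as (devel, test).
# BB and BI test scores are reported without decimals in the table.
SCORES = {
    "GE'09 task 1": (56.27, 53.15),
    "GE'09 task 2": (54.25, 50.68),
    "GE task 1": (55.78, 53.30),
    "GE task 2": (53.39, 51.97),
    "GE task 3": (38.34, 26.86),
    "EPI": (56.41, 53.33),
    "ID": (44.92, 42.57),
    "BB": (27.01, 26.0),
    "BI": (77.24, 77.0),
    "CO": (36.22, 23.77),
    "REL": (65.99, 57.7),
    "REN": (84.62, 87.0),
}


def devel_test_gaps(scores):
    """Return {corpus: devel F minus test F}, rounded to two decimals."""
    return {
        name: round(devel - test, 2)
        for name, (devel, test) in scores.items()
    }


gaps = devel_test_gaps(SCORES)

# List the corpora from the largest absolute gap to the smallest.
for name, gap in sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name:<13} {gap:+.2f}")
```

With these numbers, the largest drops from development to test are on CO (12.45 points), GE task 3 (11.48) and REL (8.29), while REN is the only task where the test score exceeds the development score.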

Björne et al. BMC Bioinformatics 2012 13(Suppl 11):S4   doi:10.1186/1471-2105-13-S11-S4
