Table 2

Statistics of the benchmark data sets for the GE and CO tasks.

                            Training          Tuning           Test
Item                      Abs.    Full    Abs.    Full    Abs.    Full

Articles                   800       5     150       5     260       4
Words                   176146   29583   33827   30305   57256   21791
Proteins                  9300    2325    2080    2610    3589    1712

Coreferences              2247       -     463       -     714       -
  Relative pronouns       1193       -     254       -     349       -
  Pronouns                 738       -     149       -     269       -
  Definite NPs             296       -      58       -      91       -
  Appositions                9       -       1       -       3       -
  Others                    11       -       1       -       2       -

Events                    8615    1695    1795    1455    3193    1294
  Gene_expression         1738     527     356     393     722     280
  Transcription            576      91      82      76     137      37
  Protein_catabolism       110       0      21       2      14       1
  Phosphorylation          169      23      47      64     139      50
    (with Site)           (67)     (0)    (27)    (12)    (81)    (15)
  Localization             265      16      53      14     174      17
    (with Loc)           (116)    (12)    (32)    (10)   (111)     (2)
  Binding                  887     101     249     126     349     153
    (with Site)          (138)    (34)    (50)   (114)    (24)    (79)
  Regulation               961     152     173     123     292      96
    (with Site)           (57)     (8)    (39)    (17)    (11)     (3)
  Positive_regulation     2847     538     618     382     987     466
    (with Site)          (175)     (7)    (75)    (47)    (37)     (7)
  Negative_regulation     1062     247     196     275     379     194
    (with Site)           (27)     (9)     (6)    (18)    (10)     (7)

Abs. = abstracts; Full = full-text articles. The event and coreference annotations are used for the GE and CO tasks, respectively.

Kim et al. BMC Bioinformatics 2012 13(Suppl 11):S1   doi:10.1186/1471-2105-13-S11-S1
