Table 7

Coverage achieved when the estimated coverage reached 99% (assuming the named entities of the other categories are already annotated in the corpus)

| Corpus: category | Coverage | # Sentences Annotated | Percentage in the Corpus |
|---|---|---|---|
| CoNLL: LOC | 98.5% | 5,500 | 39.2% |
| CoNLL: MISC | 95.0% | 3,200 | 22.8% |
| CoNLL: ORG | 99.0% | 5,400 | 38.5% |
| CoNLL: PER | 97.9% | 4,700 | 33.5% |
| GENIA: DNA | 99.6% | 8,200 | 44.2% |
| GENIA: RNA | 99.5% | 1,800 | 9.7% |
| GENIA: cell_line | 99.3% | 5,000 | 27.0% |
| GENIA: cell_type | 99.2% | 7,000 | 37.7% |
| Average | 98.5% | - | 31.6% |
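The Average row of Table 7 can be reproduced from the eight per-category rows. The following is a minimal sketch, assuming the averages are plain unweighted means over the categories; the names `rows` and `column_mean` are illustrative and do not come from the paper.

```python
# Values copied from Table 7:
# (coverage %, sentences annotated, percentage of the corpus)
rows = {
    "CoNLL: LOC": (98.5, 5500, 39.2),
    "CoNLL: MISC": (95.0, 3200, 22.8),
    "CoNLL: ORG": (99.0, 5400, 38.5),
    "CoNLL: PER": (97.9, 4700, 33.5),
    "GENIA: DNA": (99.6, 8200, 44.2),
    "GENIA: RNA": (99.5, 1800, 9.7),
    "GENIA: cell_line": (99.3, 5000, 27.0),
    "GENIA: cell_type": (99.2, 7000, 37.7),
}


def column_mean(index):
    """Unweighted mean of one column across all categories (an assumption)."""
    values = [row[index] for row in rows.values()]
    return sum(values) / len(values)


# Round to one decimal place, matching the precision used in the table.
mean_coverage = round(column_mean(0), 1)
mean_percentage = round(column_mean(2), 1)

print(f"Average coverage: {mean_coverage}%")
print(f"Average percentage of corpus annotated: {mean_percentage}%")
```

Under that assumption, the computed means agree with the Average row as printed in the table.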


Tsuruoka et al. BMC Bioinformatics 2008 9(Suppl 11):S8   doi:10.1186/1471-2105-9-S11-S8
