BioCreAtIvE Data Sets

| Set | Number of Sentences | Number of Entities | 1 word | 2 words | 3 words | 4 words | > 4 words |
| --- | --- | --- | --- | --- | --- | --- | --- |
| training | 7500 | 8876 | 46.1% | 25.7% | 14.9% | 6.6% | 6.6% |
| devtest | 2500 | 2975 | 46.6% | 23.9% | 15.1% | 6.7% | 7.7% |
| official test | 5000 | 5949 | 46.1% | 26.7% | 14.3% | 6.2% | 6.7% |

This table summarizes the BioCreAtIvE data sets, including the distribution of entity lengths in words, which shows the same tendency across all three sets.
Kinoshita et al. BMC Bioinformatics 2005 6(Suppl 1):S4 doi:10.1186/1471-2105-6-S1-S4