Table 3

Feature statistics for the different datasets (GM = generic model; BM = biological model). Note that the feature list used in BM is longer than that of GM because of the additional binary biological features (has-protein, has-two-proteins, etc.).



                         Model   TF data   PPI data   NonPF data
Total # features         GM         1327       1188         1780
                         BM          803        760         1306
# features per sentence  GM         9.70      14.44        11.43
                         BM        12.87      17.73         9.78

Yang et al. BMC Bioinformatics 2008 9(Suppl 3):S11   doi:10.1186/1471-2105-9-S3-S11