Table 6

Prediction performance on biological process classes, over the dataset of textless proteins.

Function     # Test Proteins   Text-KNN (Textless)    Text-KNN (Cross-validation)
                               P      R      F        P      R      F
GO:0065007   19                0.28   0.47   0.35     0.23   0.52   0.31
GO:0032502   18                0.19   0.22   0.21     0.22   0.19   0.20
GO:0009987   8                 0.04   0.13   0.06     0.24   0.29   0.26
GO:0050896   20                0.38   0.30   0.33     0.25   0.16   0.19
GO:0008152   7                 0.29   0.29   0.29     0.23   0.14   0.17
GO:0051234   9                 0.33   0.33   0.33     0.32   0.20   0.25
GO:0016043   6                 0.00   0.00   0.00     0.13   0.05   0.07
GO:0023052   3                 0.00   0.00   0.00     0.18   0.11   0.14
GO:0032501   9                 0.00   0.00   0.00     0.12   0.02   0.04
GO:0022414   7                 0.00   0.00   0.00     0.51   0.15   0.24
GO:0051704   1                 0.00   0.00   0.00     0.00   0.00   0.00
GO:0040011   3                 0.00   0.00   0.00     0.00   0.00   0.00
GO:0002376   1                 0.00   0.00   0.00     0.00   0.00   0.00

The Text-KNN (Textless) column shows the prediction performance of Text-KNN on proteins that have no associated text. For reference only, the Text-KNN (Cross-validation) column shows the average results obtained over the whole cross-validation dataset. The columns P, R, and F refer, respectively, to the Precision, Recall, and F-measure of the classifier over individual GO categories. Precision and recall values of 0 on a class indicate that all the proteins belonging to that class are misclassified into another class.
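As a reading aid, the relation between the P, R, and F columns can be sketched as follows. This assumes that the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the table legend does not state explicitly; the function name `f_measure` is hypothetical. Because the tabulated P and R values are themselves rounded to two decimals, recomputed F values may differ from the table in the last digit.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the table reports F1; the legend does not give the formula.
    """
    if precision + recall == 0:
        # Convention used here: F is 0 when both precision and recall are 0,
        # as in the all-zero rows of the table.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# GO:0065007, Text-KNN (Textless): P = 0.28, R = 0.47
print(round(f_measure(0.28, 0.47), 2))  # 0.35
```

Under this assumption, a class with P = R = 0 yields F = 0, matching the rows in which every value is 0.00.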

Wong and Shatkay BMC Bioinformatics 2013 14(Suppl 3):S14   doi:10.1186/1471-2105-14-S3-S14
