Table 5

Prediction performance on molecular function classes, over the dataset of textless proteins.

Function      # Textless    Text-KNN (Textless)     Text-KNN (Cross-validation)
              Proteins      P       R       F       P       R       F

GO:0005488    58            0.82    0.47    0.59    0.65    0.88    0.75
GO:0003824    9             0.29    0.56    0.38    0.52    0.23    0.32
GO:0030528    1             0.04    1.00    0.08    0.44    0.24    0.31
GO:0005215    5             0.50    0.20    0.29    0.59    0.38    0.46
GO:0060089    7             0.44    0.57    0.50    0.39    0.16    0.22
GO:0005198    2             0.00    0.00    0.00    0.04    0.01    0.01

The Text-KNN (Textless) columns show the prediction performance of Text-KNN on proteins that have no associated text. For reference, the Text-KNN (Cross-validation) columns show the average results obtained over the whole cross-validation dataset. The columns P, R, and F refer, respectively, to the Precision, Recall, and F-measure of the classifier over individual GO categories. Precision and recall values of 0 for a class indicate that all the proteins belonging to that class are misclassified into other classes.
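To make the P, R, and F columns concrete, the following is a minimal sketch of how such per-class scores are commonly computed. It assumes the F-measure is the balanced harmonic mean of precision and recall (F1); the table legend does not state the exact formula, so this is an assumption. The function names and the counts in the example are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for one class from its counts.

    tp: proteins correctly assigned to the class
    fp: proteins wrongly assigned to the class
    fn: proteins of the class assigned elsewhere
    Undefined ratios (zero denominator) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Balanced F-measure (assumed F1): harmonic mean of P and R."""
    if precision + recall == 0:
        return 0.0  # a fully misclassified class scores 0 rather than dividing by zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: a class with a single member that is found (recall 1.0),
# but to which 24 other proteins are also wrongly assigned (low precision).
p, r = precision_recall(tp=1, fp=24, fn=0)
print(round(p, 2), round(r, 2), round(f_measure(p, r), 2))
```

Recomputing F from the rounded P and R values in the table can differ from the reported F in the last digit, since the reported values were presumably computed from unrounded scores and the cross-validation figures are averages.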

Wong and Shatkay BMC Bioinformatics 2013 14(Suppl 3):S14   doi:10.1186/1471-2105-14-S3-S14
