Table 3

Classifier performance in predicting GO terms in mouse, quantified by area under the ROC curve (AUC) and precision at 20% recall (P20R).

Classifier          AUC                       P20R
(namespace)         MF      BP      CC        MF      BP      CC

Cross-species       0.90    0.67    0.81      0.52    0.16    0.42
Species-specific    0.86    0.83    0.86      0.42    0.29    0.46
Multi-view          0.91    0.81    0.88      0.57    0.30    0.58
Chain               0.89    0.82    0.87      0.51    0.28    0.52

The cross-species classifier uses only sequence data; the species-specific classifier uses a collection of genomic data: protein-protein interactions (PPI), gene expression, and protein-GO term co-mentions mined from the biomedical literature. The multi-view and chain classifiers are two approaches to integrating cross-species and species-specific data. Each value is the average across all GO terms considered in the given namespace. The results were obtained using five-fold cross-validation.
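As a rough illustration of how the two measures in this table could be computed for a single GO term and then averaged over a namespace, here is a minimal Python sketch. It is not the authors' evaluation code. The function names are hypothetical, and the definition of P20R used here (precision at the first rank cutoff where recall reaches 20%) is an assumption, since the table does not spell it out. Cross-validation folds are omitted.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and
    negatives (Mann-Whitney formulation); tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at_recall(labels, scores, recall_level=0.2):
    """Precision at the first rank cutoff where recall reaches recall_level.
    Assumed definition of P20R; tied scores are not treated specially."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank proteins from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = 0
    for k, (_, y) in enumerate(ranked, start=1):
        true_pos += y
        if true_pos / n_pos >= recall_level:
            return true_pos / k


def namespace_average(per_term, metric):
    """Average a metric over GO terms; per_term maps a term identifier
    to a (labels, scores) pair for that term."""
    return mean(metric(labels, scores) for labels, scores in per_term.values())


# Toy data for two hypothetical GO terms (labels: 1 = annotated, 0 = not).
toy = {
    "term_a": ([0, 1, 1, 0], [0.9, 0.8, 0.7, 0.1]),
    "term_b": ([1, 0, 0, 1], [0.9, 0.8, 0.2, 0.1]),
}
print(namespace_average(toy, auc))
print(namespace_average(toy, precision_at_recall))
```

Averaging per term, as in the table, gives every GO term equal weight regardless of how many proteins it annotates.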

Sokolov et al. BMC Bioinformatics 2013 14(Suppl 3):S10   doi:10.1186/1471-2105-14-S3-S10
