Table 1

Function prediction performance using KNN for the gold-standard dataset

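| Superfamily                | Precision (before SVD) | Recall (before SVD) | Precision (after SVD) | Recall (after SVD) | ∆Prec. | ∆Rec.  |
|----------------------------|------------------------|---------------------|-----------------------|--------------------|--------|--------|
| Amidohydrolase             | 0.983                  | 0.983               | 1.000                 | 1.000              | +1.7%  | +1.7%  |
| Crotonase                  | 0.955                  | 0.953               | 0.979                 | 0.977              | +2.4%  | +2.4%  |
| Enolase                    | 0.876                  | 0.853               | 0.971                 | 0.967              | +9.5%  | +11.4% |
| Haloacid Dehalogenase      | 0.881                  | 0.925               | 0.984                 | 0.981              | +10.3% | +5.6%  |
| Isoprenoid Synthase Type I | 1.000                  | 1.000               | 1.000                 | 1.000              | +0.0%  | +0.0%  |
| Vicinal Oxygen Chelate     | 1.000                  | 1.000               | 1.000                 | 1.000              | +0.0%  | +0.0%  |
| All                        | 0.901                  | 0.903               | 0.991                 | 0.989              | +9.0%  | +8.6%  |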

Prediction performance for the gold-standard dataset using KNN. The experiment was performed within each superfamily, and the prediction classes are the enzyme families. Precision and recall are weighted averages. ∆Prec. and ∆Rec. give the absolute change after SVD, in percentage points. Ten-fold cross-validation was employed.
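
The caption reports precision and recall as weighted averages but does not say how the weights are chosen. A minimal sketch, assuming each family is weighted by its number of true instances (a common convention, not confirmed by the source); the function name and toy labels below are hypothetical and are not data from the paper:

```python
from collections import Counter


def weighted_precision_recall(y_true, y_pred):
    """Average per-class precision and recall, weighted by class support.

    Assumption: the weight of a class is its share of true instances.
    """
    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predictions made per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = 0.0
    recall = 0.0
    for label, n in support.items():
        # Convention: a class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        precision += p * n / total
        recall += r * n / total
    return precision, recall


# Toy example with two hypothetical families, "A" and "B".
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
precision, recall = weighted_precision_recall(y_true, y_pred)
```

Under this weighting, the weighted recall equals the overall accuracy, because each class's recall is multiplied by its own share of the data.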

Pires et al. BMC Genomics 2011 12(Suppl 4):S12   doi:10.1186/1471-2164-12-S4-S12
