Table 5

Predictive performance for different dissimilarity measures

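| Data set | Dissimilarity measure | Training set accuracy | Training set MCC | Test set accuracy | Test set MCC |
|----------|-----------------------|-----------------------|------------------|-------------------|--------------|
| I        | Soergel               | 0.972                 | 0.944            | 0.854             | 0.714        |
| I        | Dice                  | 0.979                 | 0.959            | 0.825             | 0.653        |
| I        | Manhattan^a           | 0.941                 | 0.881            | 0.861             | 0.725        |
| I        | Rogers-Tanimoto       | 0.961                 | 0.923            | 0.861             | 0.725        |
| II       | Soergel               | 0.965                 | 0.933            | 0.860             | 0.725        |
| II       | Dice                  | 0.965                 | 0.933            | 0.868             | 0.745        |
| II       | Manhattan^a           | 0.978                 | 0.956            | 0.904             | 0.807        |
| II       | Rogers-Tanimoto       | 0.973                 | 0.946            | 0.897             | 0.793        |
| III      | Soergel               | 0.989                 | 0.977            | 0.846             | 0.706        |
| III      | Dice                  | 0.989                 | 0.977            | 0.855             | 0.717        |
| III      | Manhattan             | 0.979                 | 0.954            | 0.838             | 0.686        |
| III      | Rogers-Tanimoto^a     | 0.947                 | 0.885            | 0.846             | 0.711        |
| IV       | Soergel               | 0.904                 | 0.823            | 0.667             | 0.301        |
| IV       | Dice                  | 0.904                 | 0.823            | 0.635             | 0.307        |
| IV       | Manhattan             | 0.957                 | 0.918            | 0.714             | 0.433        |
| IV       | Rogers-Tanimoto^a     | 0.898                 | 0.811            | 0.794             | 0.584        |

^a Model selected based on the number of prototype conformers.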
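
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and MCC (taken here to mean the Matthews correlation coefficient) are conventionally computed from binary confusion-matrix counts. The function name `accuracy_and_mcc` and the example counts are hypothetical and are not taken from the paper; this illustrates the standard definitions, not the authors' evaluation code.

```python
import math


def accuracy_and_mcc(tp: int, tn: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (accuracy, MCC) for a binary confusion matrix.

    tp, tn, fp, fn are counts of true positives, true negatives,
    false positives and false negatives (hypothetical inputs).
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total

    # Standard MCC definition; by convention it is set to 0 when any
    # marginal sum is zero and the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return accuracy, mcc


# Illustrative counts only (not data from Table 5).
acc, mcc = accuracy_and_mcc(tp=50, tn=40, fp=5, fn=5)
print(f"accuracy={acc:.3f}, MCC={mcc:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it can fall well below accuracy when the classes are imbalanced or one class is predicted poorly, which is consistent with the gap between the two columns for data set IV.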

Fu et al. BMC Bioinformatics 2012 13(Suppl 15):S3   doi:10.1186/1471-2105-13-S15-S3
