Table 25

Comparison of our results for the S1615 data set with previously published studies.

| Ref. | Method | Data Set Size | Accuracy | Information |
|------|--------|---------------|----------|-------------|
| [41] | SVM | 2048 | 0.77 (20-fold cv) | Seq |
| [42] | SVM | 1383* | 0.73 (20-fold cv) | Seq |
| [9] | NN; NN+FOLDX | 1615 | 0.79 (20-fold cv); 0.87 (test set†); 0.93 (test set†) | Seq+Str |
| [2] | SVM | 1496‡ | SO: 0.84, TO: 0.85, ST: 0.85 (20-fold cv); SO: 0.86, TO: 0.86, ST: 0.86 (test set) | Seq+Str |
| [31] | iPTREE | 1615 | 0.87 (10-fold cv) | Seq+Str |
| Ours | Early | 1122 (training), 383 (test) | 0.842 (20-fold cv), 0.904 (test set) | Seq+Str |
| Ours | Late | 1122 (training), 383 (test) | 0.847 (20-fold cv), 0.903 (test set) | Seq+Str |
| Ours | Intermediate | 1122 (training), 383 (test) | 0.826 (20-fold cv), 0.879 (test set) | Seq+Str |

* Filtered from the set of 2048 mutations [41].

† A subset of the training set that was previously used in training.

‡ Filtered from the set of 1615 mutations [9].

The main features compared are the machine learning method, the data set, and the performance assessment. (Seq: sequence-based information; Seq+Str: sequence- and structure-based information.)

Özen et al. BMC Structural Biology 2009 9:66   doi:10.1186/1472-6807-9-66
