Table 3

Performance of different prediction methods using a balanced dataset of mutations that map to structure extracted from HumVar.

Method              Cross-validated    MCC      Acc

SAAPpred            Yes                0.692    0.846
SAAPpred            No                 0.894    0.944
PolyPhen2           No                 0.572    0.785
SIFT                ?                  0.528    0.763
MutationAssessor    N/A                0.453    0.698


The cross-validated values for SAAPpred were obtained from 10-fold cross-validation performed during Weka training, using all 1540 SNPs from HumVar that mapped to structure together with a random sample of 1540 of the 7182 PDs that mapped to structure. This was repeated 10 times and the results were averaged. The non-cross-validated results were obtained using a slightly smaller set of 1451 SNPs that mapped to structure and could be assessed by all the other methods, together with a random sample of 1451 PDs that could be assessed by all methods. Again, this was repeated 10 times and the results were averaged. The non-cross-validated values for SAAPpred give the fairest comparison with PolyPhen2, which is trained on the HumVar dataset. It is unclear exactly what data were used to train the most recent version of SIFT, so there may be some overlap between its training and test sets, while MutationAssessor has no training set per se.
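The balanced-sampling procedure in the note above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline (which used Weka): it assumes that MCC is the Matthews correlation coefficient and Acc the plain accuracy, both computed from confusion-matrix counts, and that the 10 repeats are combined as a simple mean. The names `mcc`, `accuracy`, `balanced_repeats` and the `evaluate` callback are hypothetical and stand in for whichever predictor is being assessed.

```python
import math
import random


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_repeats(snps, pds, evaluate, repeats=10, seed=0):
    """Average (MCC, Acc) over repeated balanced samples.

    Each repeat keeps every neutral SNP and draws an equally sized random
    sample of pathogenic mutations (PDs) without replacement.
    `evaluate` is a hypothetical callback that takes (snps, pd_sample)
    and returns the counts (tp, tn, fp, fn).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        pd_sample = rng.sample(pds, len(snps))
        tp, tn, fp, fn = evaluate(snps, pd_sample)
        scores.append((mcc(tp, tn, fp, fn), accuracy(tp, tn, fp, fn)))
    mean_mcc = sum(s[0] for s in scores) / repeats
    mean_acc = sum(s[1] for s in scores) / repeats
    return mean_mcc, mean_acc
```

With 1540 SNPs and 7182 PDs, each repeat would draw a different set of 1540 PDs, so averaging over the repeats reduces the dependence of the reported MCC and Acc on any single random sample.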

Al-Numair and Martin BMC Genomics 2013 14(Suppl 3):S4   doi:10.1186/1471-2164-14-S3-S4
