Table 2

Performance measures of data mining algorithms at different levels of significance on the Type 1 diabetes dataset
Significance levels (left to right, four columns each): p < 5 × 10^-13, p < 5 × 10^-10, p < 5 × 10^-7, p < 5 × 10^-4
Algorithm Acc. Sp Sn AUC Acc. Sp Sn AUC Acc. Sp Sn AUC Acc. Sp Sn AUC Avg.
SLR 87.5 85.0 89.7 0.93 92.5 90.2 94.9 0.97 92.5 92.0 92.0 0.96 92.5 90.0 94.9 0.96 92.2
Naïve Bayes 90.0 85.4 95.0 0.97 91.3 90.2 92.3 0.98 92.5 90.2 95.0 0.96 89.0 85.4 92.3 0.92 92.0
SVM 88.8 82.9 94.9 0.89 90.0 82.9 97.4 0.90 93.8 90.2 97.4 0.93 93.8 92.7 94.9 0.94 91.6
R. Forest 87.5 87.8 87.2 0.96 92.5 90.2 94.9 0.97 91.5 87.8 94.9 0.97 88.8 85.4 92.3 0.94 91.5
KNN 92.5 90.2 94.9 0.95 95.0 92.7 97.4 0.96 90.0 85.4 94.9 0.93 85.0 80.5 89.7 0.90 91.4
Logistic. R 86.3 87.8 84.6 0.82 92.5 90.2 94.9 0.97 92.5 92.7 97.4 0.97 87.5 92.7 82.1 0.92 90.6
VFI 87.5 82.9 92.3 0.95 92.5 90.2 94.9 0.97 88.8 85.4 92.3 0.95 87.5 82.9 92.3 0.92 90.5
Bayes Net 91.3 90.2 92.3 0.97 90.0 85.4 94.9 0.98 90.0 85.4 94.9 0.95 83.8 78.0 89.7 0.89 90.3
MLP 80.0 80.5 79.5 0.89 91.3 90.2 92.3 0.98 93.8 90.2 97.4 0.99 dnf dnf dnf dnf 90.1*
Hyper Pipes 87.5 90.2 84.6 0.96 91.3 90.2 92.3 0.97 90.0 90.2 89.7 0.95 83.8 92.7 74.4 0.92 89.8
K-means 91.3 82.9 100 0.92 90.0 82.9 97.4 0.90 86.3 78.0 94.9 0.87 85.0 75.6 94.9 0.85 88.3
M5P 88.8 85.4 92.3 0.94 85.0 80.5 89.7 0.94 81.3 78.0 84.6 0.87 78.8 73.2 84.6 0.85 85.1
Random Tree 85.0 87.8 82.1 0.85 78.8 75.6 82.1 0.79 87.5 85.4 89.7 0.88 83.8 85.4 82.1 0.84 83.8
K star 87.5 87.8 87.2 0.96 91.3 85.4 97.4 0.98 90.0 85.4 94.9 0.97 53.8 100 5.1 0.54 81.9
J48 86.3 85.4 87.2 0.79 81.3 82.9 79.5 0.83 78.8 82.9 74.4 0.72 80.0 85.4 74.4 0.73 80.3
ASC 86.3 85.4 87.2 0.79 80.0 82.9 76.9 0.80 80.0 87.8 71.8 0.78 66.3 80.5 51.3 0.55 76.8
LDA 88.8 82.9 94.9 0.96 91.3 85.4 97.4 0.95 40.0 96.7 15.8 0.68 21.3 94.4 0.0 0.48 69.7

Acc: Accuracy, Sp: Specificity, Sn: Sensitivity, AUC: Area under the ROC curve, Avg: Average score in % for each algorithm, dnf: "Did Not Finish", * denotes Avg. computed from 3 significance levels. Measures >90% are marked in bold.
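
The legend does not say how Avg. is computed. The tabulated values are consistent with an unweighted mean over all reported measures in a row, with AUC rescaled to percent and "dnf" entries skipped. That reading is inferred from the numbers and is an assumption, not something the source states. A minimal Python sketch under that assumption (the function name average_score is hypothetical):

```python
def average_score(measures):
    """Mean of all reported measures in one table row, in percent.

    `measures` is the row as a flat list of (Acc, Sp, Sn, AUC) groups,
    one group per significance level. Use None for "dnf" entries.
    Assumption: AUC (0..1) is rescaled to percent before averaging.
    """
    scores = []
    for i, value in enumerate(measures):
        if value is None:          # "dnf": skip, as the * footnote implies
            continue
        is_auc = (i % 4 == 3)      # every fourth entry is an AUC
        scores.append(value * 100 if is_auc else value)
    return round(sum(scores) / len(scores), 1)


# Rows copied from the table above.
slr = [87.5, 85.0, 89.7, 0.93, 92.5, 90.2, 94.9, 0.97,
       92.5, 92.0, 92.0, 0.96, 92.5, 90.0, 94.9, 0.96]
mlp = [80.0, 80.5, 79.5, 0.89, 91.3, 90.2, 92.3, 0.98,
       93.8, 90.2, 97.4, 0.99, None, None, None, None]

print(average_score(slr))  # matches the SLR Avg. entry
print(average_score(mlp))  # matches the MLP Avg. entry (3 levels)
```

Under this reading the MLP average uses 12 values rather than 16, which agrees with the * footnote.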

Kukreja et al. BMC Bioinformatics 2012 13:139   doi:10.1186/1471-2105-13-139
