Table 5

Prediction accuracy in the UCI machine learning benchmark data
Data set      RGLM    RGLM.inter2  RF     RFbigmtry  Rpart    LDA     DLDA    KNN    SVM    SC
BreastCancer  0.964   0.959        0.969  0.961      0.941    0.957   0.959   0.966  0.967  0.956
HouseVotes84  0.961   0.963        0.958  0.954      0.954    0.951   0.914   0.924  0.958  0.938
Ionosphere    0.883   0.946        0.932  0.917      0.875    0.863   0.809   0.849  0.940  0.829
diabetes      0.768   0.759        0.759  0.754      0.741    0.768   0.732   0.740  0.757  0.743
Sonar         0.769   0.837        0.817  0.788      0.707    0.726   0.697   0.812  0.822  0.726
ringnorm      0.577   0.973        0.940  0.910      0.770    0.567   0.570   0.590  0.977  0.535
threenorm     0.803   0.827        0.807  0.777      0.653    0.817   0.825   0.815  0.853  0.817
twonorm       0.937   0.953        0.947  0.920      0.733    0.957   0.960   0.947  0.953  0.960
Glass         0.636   0.743        0.827  0.799      0.729    0.659   0.531   0.808  0.748  0.645
Satellite     0.986   0.987        0.988  0.985      0.961    0.985   0.734   0.990  0.988  0.803
Vehicle       0.965   0.986        0.986  0.973      0.944    0.967   0.729   0.909  0.974  0.752
Vowel         0.936   0.986        0.983  0.976      0.950    0.938   0.853   0.999  0.991  0.909
MeanAccuracy  0.849   0.910        0.909  0.893      0.830    0.846   0.776   0.862  0.911  0.801
Rank          6       2            2      4          8        7       10      5      2      9
Pvalue        0.0093  NA           0.26   0.042      0.00049  0.0093  0.0067  0.11   0.96   0.0015

For each data set, prediction accuracy was estimated by 3-fold cross-validation across 100 random partitions of the data into 3 folds. RGLM.inter2 adds pairwise interactions between features to the RGLM predictor. Mean accuracies over the 12 data sets and the resulting ranks are summarized in the bottom rows. The Wilcoxon signed rank test was used to test whether the accuracy differences between RGLM.inter2 and each of the other predictors are significant, which is why the Pvalue entry for RGLM.inter2 itself is NA. RGLM.inter2, RF, and SVM tie for first place (resulting in a rank of 2 for each method).
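The evaluation protocol described in the caption can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the function names (`kfold_partition`, `repeated_cv_accuracy`, `nearest_centroid`), the synthetic two-cluster data, and the choice to average accuracy over the partitions are all assumptions made for the example, and the toy nearest-centroid classifier merely stands in for the predictors compared in the table.

```python
import random
from statistics import mean


def kfold_partition(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv_accuracy(X, y, fit_predict, k=3, repeats=100, seed=0):
    """Mean accuracy over `repeats` random partitions of the data into k folds.

    fit_predict(train_X, train_y, test_X) must return one label per test row.
    """
    rng = random.Random(seed)
    per_partition = []
    for _ in range(repeats):
        correct = 0
        for held_out in kfold_partition(len(X), k, rng):
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            preds = fit_predict(
                [X[i] for i in train],
                [y[i] for i in train],
                [X[i] for i in held_out],
            )
            correct += sum(p == y[i] for p, i in zip(preds, held_out))
        # Every sample is held out exactly once per partition.
        per_partition.append(correct / len(X))
    return mean(per_partition)


def nearest_centroid(train_X, train_y, test_X):
    """Toy stand-in classifier (hypothetical, not one of the table's methods)."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, t in zip(train_X, train_y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in test_X]


# Synthetic data: two well-separated Gaussian clusters (illustration only).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(30)]
X += [[data_rng.gauss(5, 1), data_rng.gauss(5, 1)] for _ in range(30)]
y = [0] * 30 + [1] * 30

acc = repeated_cv_accuracy(X, y, nearest_centroid, k=3, repeats=100)
print(round(acc, 3))
```

Passing the classifier as a `fit_predict` callable keeps the resampling loop independent of the model, so the same partitions could in principle be reused to compare several predictors on equal footing before applying a paired test such as the Wilcoxon signed rank test.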


Song et al. BMC Bioinformatics 2013 14:5 doi:10.1186/1471-2105-14-5
