Binary outcome prediction in empirical gene expression data sets. The boxplots show test set prediction accuracies across 700 comparisons (100 dichotomized gene traits from each of 7 expression data sets). The horizontal line inside each box marks the median accuracy, and the horizontal dashed red line marks the median accuracy of the RGLM predictor. P-values come from two-sided Wilcoxon signed-rank tests of whether the median accuracy of RGLM equals that of the named method; for example, p.RF tests whether the median accuracy of RGLM equals that of RF. (A) summarizes test set performance over all 7 data sets combined. (B-H) show the results for the individual data sets, each based on 100 randomly chosen, dichotomized gene traits. Note the superior accuracy of the RGLM predictor across the different data sets.
Song et al. BMC Bioinformatics 2013 14:5 doi:10.1186/1471-2105-14-5
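
As a rough illustration of the paired comparison behind each p-value in the caption (for example p.RF), the sketch below implements a two-sided Wilcoxon signed-rank test with the normal approximation, using only the Python standard library. The function name `wilcoxon_signed_rank`, the variable names, and the simulated accuracies are assumptions made for illustration and are not taken from the paper. The authors' analysis code is not shown here, and an exact or continuity-corrected test may give slightly different p-values.

```python
import math
import random


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    x, y: paired measurements, e.g. accuracies of two predictors on the
    same traits. Zero differences are dropped; tied absolute differences
    receive average ranks. Returns (W_plus, p_value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Mean and tie-corrected variance of W+ under the null hypothesis.
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, p_value


# Simulated (made-up) accuracies for 100 traits; NOT data from the paper.
random.seed(0)
acc_other = [random.uniform(0.6, 0.9) for _ in range(100)]
acc_rglm = [min(1.0, a + random.gauss(0.02, 0.03)) for a in acc_other]

w_plus, p = wilcoxon_signed_rank(acc_rglm, acc_other)
print(f"W+ = {w_plus:.1f}, two-sided p = {p:.3g}")
```

The test is paired because both predictors are evaluated on the same traits, so only the per-trait differences in accuracy enter the statistic. In practice an established routine such as `scipy.stats.wilcoxon` in Python or `wilcox.test(..., paired = TRUE)` in R would normally be preferred over a hand-written version.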