Figure 3.

Prediction experiments. LOESS-smoothed AUC and explained phenotypic variance (denoted “VarExp”) for the Finnish celiac disease dataset, for increasing model sizes. AUC is estimated over 20×3-fold cross-validation, except for HyperLasso, for which we ran only 2×3-fold cross-validation due to the high computational cost. The explained phenotypic variance is estimated from the AUC using the method of [11], assuming a population prevalence of celiac disease of K = 1%. Note that glmnet, HyperLasso, LIBLINEAR (denoted “LL-L1”), and SparSNP used an L1-penalised model, whereas LIBLINEAR-CDBLOCK (denoted “LL-CD-L2”) used an L2-penalised (non-sparse) model, which uses all 516,504 SNPs and is therefore shown as a horizontal line across all model sizes. Note that tuning the L2 penalty for LIBLINEAR-CDBLOCK resulted in very similar AUC

Abraham et al. BMC Bioinformatics 2012 13:88   doi:10.1186/1471-2105-13-88
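
The conversion from AUC to explained phenotypic variance mentioned in the caption can be illustrated with a minimal sketch. It assumes a standard liability-threshold model in which the predictor is normally distributed in cases and in controls; whether this matches the method of [11] term by term is an assumption, and the function names `varexp_from_auc` and `auc_from_varexp` are hypothetical, introduced only for this illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def _threshold_terms(prevalence):
    """Liability threshold T, mean liability of cases i and of controls v."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    t = _STD_NORMAL.inv_cdf(1.0 - prevalence)  # threshold on the liability scale
    z = _STD_NORMAL.pdf(t)                     # normal density at the threshold
    i = z / prevalence                         # mean liability among cases
    v = -z / (1.0 - prevalence)                # mean liability among controls
    return t, i, v


def varexp_from_auc(auc, prevalence):
    """Variance explained on the liability scale implied by a given AUC.

    Assumed model (not taken from the caption): a predictor explaining a
    fraction h2 of liability variance has mean i*h2 in cases, v*h2 in
    controls, and variances h2*(1 - h2*i*(i - T)) and h2*(1 - h2*v*(v - T)).
    """
    if not 0.0 < auc < 1.0:
        raise ValueError("auc must lie strictly between 0 and 1")
    t, i, v = _threshold_terms(prevalence)
    q = _STD_NORMAL.inv_cdf(auc)
    denom = (i - v) ** 2 + q ** 2 * (i * (i - t) + v * (v - t))
    return 2.0 * q ** 2 / denom


def auc_from_varexp(h2, prevalence):
    """Inverse mapping under the same assumed model, used as a sanity check."""
    t, i, v = _threshold_terms(prevalence)
    spread = h2 * ((1.0 - h2 * i * (i - t)) + (1.0 - h2 * v * (v - t)))
    return _STD_NORMAL.cdf((i - v) * h2 / spread ** 0.5)


if __name__ == "__main__":
    # K = 1% as in the caption; the AUC values are illustrative only.
    for auc in (0.80, 0.85, 0.90):
        print(f"AUC={auc:.2f} -> VarExp={varexp_from_auc(auc, 0.01):.3f}")
```

Under this model an AUC of 0.5 maps to zero explained variance, and the mapping increases monotonically with AUC for a fixed prevalence, so the two panels of the figure would be expected to show the same ranking of methods.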