Supervised analysis. A. Results of logistic regression on the Validation set. The AUCs of the models are reported at the top, and the parameters used in each model are shown at the bottom. A colored square indicates that the parameter was used in the model; a white square indicates that it was not. The last row reports the number of genes used by the model, if any. Models that include both clinical and molecular parameters are reported only if they improved on the corresponding models that use clinical parameters alone. Models are sorted from left to right according to their AUC. For models that include genes, we estimated confidence intervals (CIs) from the sampling distribution of AUCs generated by the iterative cross-validation procedure on the Learning set. For the other models, CIs were estimated by bootstrap on the Validation set. The genes included in the models are listed in Additional file 1, Table S4. B. Contingency table showing the association between ERG rearrangement status and clinical outcome. Values in parentheses are the expected numbers of cases under the assumption of no association.
Sboner et al. BMC Medical Genomics 2010 3:8 doi:10.1186/1755-8794-3-8
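The expected numbers shown in parentheses in panel B can be illustrated with a minimal sketch. It assumes the standard independence calculation (row total times column total, divided by the grand total), which the caption implies but does not spell out. The function name `expected_counts` and the counts below are hypothetical and are not the values from the figure.

```python
from typing import List


def expected_counts(observed: List[List[float]]) -> List[List[float]]:
    """Return expected cell counts assuming rows and columns are independent.

    Each expected count is row_total * column_total / grand_total.
    Assumes a rectangular table (all rows have the same length).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table must contain at least one case")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table (NOT the data from the figure):
# rows could stand for rearrangement status, columns for clinical outcome.
observed = [[30, 10],
            [20, 40]]
expected = expected_counts(observed)

# Print each cell as "observed (expected)", mirroring the layout the caption describes.
for obs_row, exp_row in zip(observed, expected):
    print("  ".join(f"{o} ({e:.1f})" for o, e in zip(obs_row, exp_row)))
```

Large gaps between observed and expected counts suggest an association; a formal test (for example, chi-squared or Fisher's exact test) would be needed to quantify it.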