Performance as a function of dataset size. Average AUCs with 95% confidence intervals for a subset of sources and the baselines, by training set size, based on three repetitions of five-fold cross-validation. The ‘Surface factor’ virulence class is omitted because of the small number of instances in the training set.
Cadag et al. BMC Bioinformatics 2012 13:321 doi:10.1186/1471-2105-13-321
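
The caption summarizes each training set size by an average AUC and a 95% confidence interval over three repetitions of five-fold cross-validation, that is, 15 per-fold AUCs per point. The sketch below shows one way such a summary could be computed. The caption does not say how the intervals were derived, so the normal approximation used here is an assumption, as are the function name `mean_auc_with_ci` and the example AUC values, which are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_auc_with_ci(fold_aucs, level=0.95):
    """Return (mean, lower, upper) for a list of per-fold AUC values.

    Assumption: a normal-approximation interval on the mean; the paper's
    actual interval method is not stated in the caption.
    """
    n = len(fold_aucs)
    if n < 2:
        raise ValueError("need at least two AUC values")
    m = mean(fold_aucs)
    sem = stdev(fold_aucs) / sqrt(n)           # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half_width = z * sem
    return m, m - half_width, m + half_width


# Hypothetical AUCs for one training set size: 3 repetitions x 5 folds.
example_aucs = [
    0.81, 0.79, 0.83, 0.80, 0.82,
    0.78, 0.84, 0.80, 0.81, 0.82,
    0.80, 0.79, 0.83, 0.81, 0.82,
]

avg, lower, upper = mean_auc_with_ci(example_aucs)
print(f"mean AUC = {avg:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Folds from repeated cross-validation share training data and are therefore not independent, so an interval computed this way tends to be somewhat narrow and is best read as a rough indication of spread.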