Figure 1.

Cross-validation (CV) of three datasets derived from the HapMap 3 resource using v = 5 folds. After subsampling 13,928 markers to minimize linkage disequilibrium, we separately cross-validated datasets containing unrelated individuals from the following HapMap 3 subsamples: (a) CEU; (b) CEU, ASW, and YRI; and (c) CEU, ASW, YRI, and MEX. Each plot displays CV error versus K, the number of ancestral populations. For the CEU dataset, CV suggests that K = 1 is the best fit, in agreement with intuition. K = 2 is the best fit for the CEU+ASW+YRI dataset, which contains European, African, and admixed African-American samples. K = 3 is the best fit for the CEU+ASW+YRI+MEX dataset, which additionally contains Mexican-American samples.
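
The selection rule implied by the caption (run v-fold CV for each candidate K, then prefer the K with the lowest CV error) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `make_folds` and `best_k` are hypothetical, the error values are placeholders rather than values read from the figure, and the step that fits the model and scores each held-out fold is omitted.

```python
import random


def make_folds(n_items, v=5, seed=0):
    """Randomly partition indices 0..n_items-1 into v near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Striding by v gives folds whose sizes differ by at most one.
    return [indices[i::v] for i in range(v)]


def best_k(cv_error_by_k):
    """Return the K with the lowest CV error; ties go to the smaller K."""
    # min() keeps the first minimal element, so iterating K in
    # ascending order resolves ties in favor of the simpler model.
    return min(sorted(cv_error_by_k), key=lambda k: cv_error_by_k[k])


# Placeholder values for illustration only, NOT results from Figure 1.
example_errors = {1: 0.60, 2: 0.52, 3: 0.55}

folds = make_folds(10, v=5)
chosen = best_k(example_errors)
```

With the placeholder errors above, `best_k` picks K = 2, the same kind of reading one makes from the lowest point of each plotted curve.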

Alexander and Lange, BMC Bioinformatics 2011, 12:246. doi:10.1186/1471-2105-12-246
