Genome-wide imputation based on the Ilmn650Y data from the 384 Caucasian Americans in the deLiver cohort. We surveyed nine QS score thresholds, from 0.1 to 0.9 in steps of 0.1. At each threshold, we computed the fraction of SNPs that passed it (yield) as well as the imputation accuracy, using the 1% of SNPs that were randomly masked. Here, accuracy is defined as the fraction of imputed genotypes that matched the observed genotypes. For each parameter setting, we conducted the random masking and imputation twice and obtained very similar results from the two realizations. Because the accuracy estimate was derived from a large number of SNPs (N ≈ 3K) and a large number of subjects (N = 384), it is very stable. The left panel shows that imputation performance was better on the Ilmn650Y data (red) than on the Ilmn317K data (black), and that untyped SNPs in weak LD with the assayed SNPs were imputed less successfully (black dotted line). The middle and right panels show that both imputation accuracy and yield were higher for the Ilmn650Y data than for the Ilmn317K data.
Hao et al. BMC Genetics 2009 10:27 doi:10.1186/1471-2156-10-27
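The yield and accuracy definitions in the caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `yield_and_accuracy`, the record layout (`qs`, `observed`, `imputed`), and the toy genotypes are hypothetical. It also assumes that a SNP "passes" when its QS score is greater than or equal to the threshold, and for simplicity it computes yield over the same masked SNPs used for accuracy.

```python
def yield_and_accuracy(snps, thresholds):
    """Return (threshold, yield, accuracy) for each QS threshold.

    snps: list of dicts with a QS score and per-subject genotype calls
          (hypothetical layout, chosen for this illustration only).
    """
    results = []
    for t in thresholds:
        # Assumption: a SNP passes when its QS score is >= the threshold.
        passed = [s for s in snps if s["qs"] >= t]
        # Yield: fraction of SNPs that pass the threshold.
        snp_yield = len(passed) / len(snps) if snps else float("nan")
        # Accuracy: fraction of imputed genotypes matching observed ones,
        # pooled over all subjects of the passing SNPs.
        n_total = sum(len(s["observed"]) for s in passed)
        n_match = sum(
            obs == imp
            for s in passed
            for obs, imp in zip(s["observed"], s["imputed"])
        )
        accuracy = n_match / n_total if n_total else float("nan")
        results.append((t, snp_yield, accuracy))
    return results


# Toy masked SNPs: genotypes coded 0/1/2 for four subjects (made-up values).
snps = [
    {"qs": 0.95, "observed": [0, 1, 2, 1], "imputed": [0, 1, 2, 1]},
    {"qs": 0.55, "observed": [0, 0, 1, 2], "imputed": [0, 1, 1, 2]},
    {"qs": 0.15, "observed": [2, 2, 1, 0], "imputed": [1, 2, 0, 0]},
]

# Nine thresholds from 0.1 to 0.9 in steps of 0.1, as in the caption.
thresholds = [k / 10 for k in range(1, 10)]
results = yield_and_accuracy(snps, thresholds)

for t, y, a in results:
    print(f"QS >= {t:.1f}: yield = {y:.3f}, accuracy = {a:.3f}")
```

In this toy data, raising the threshold lowers the yield and raises the accuracy, which is the trade-off the middle and right panels are meant to display.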