Table 3

Efficiency and SNP detection rates of non-barcoded and pooled samples
| Minimum read count for SNP call | Library ID | Positive control SNPs | Positive control SNPs in sample | Total SNPs in sample | Sensitivity | False discovery rate |
|---|---|---|---|---|---|---|
| 5 | 768_1L | 244 | 226 | 376 | 92.6% | 39.9% |
| 5 | 768_2L | 244 | 230 | 371 | 94.3% | 38.0% |
| 10 | 768_1L | 244 | 212 | 277 | 86.9% | 23.5% |
| 10 | 768_2L | 244 | 212 | 267 | 86.9% | 20.6% |
| 20 | 768_1L | 244 | 193 | 214 | 79.1% | 9.8% |
| 20 | 768_2L | 244 | 198 | 221 | 81.1% | 10.4% |

- Minimum read count for SNP call: Minimum number of reads carrying the non-reference allele required for a SNP to be considered detected.

- Positive control SNPs: Positive control SNPs generated from the non-pooled, non-barcoded data (759–764). Because the HapMap genotyping data were incomplete, even for known SNPs, we attempted to create a positive control set of SNPs within the targeted regions. If a SNP was detected within samples 759–764, a combined genotype was determined for that SNP position. For example, if position X had a “CG” genotype in sample 759 and the reference genotype “CC” in samples 760–764, the predicted allele frequency would be 8.3% (1 non-reference allele in 12). In the non-pooled samples, a SNP with a non-reference allele frequency of 10–90% was considered heterozygous, and a SNP with a non-reference allele frequency of >90% was considered homozygous. The number in this column is the total number of SNPs that have a non-reference allele within a given pooled sample. Note that these positive control SNPs include HapMap samples with rs IDs, non-HapMap samples with rs IDs, and potentially novel SNPs.

- Positive control SNPs found (column “Positive control SNPs in sample”): The number of positive control SNPs that were detected in a given pool with a given set of parameters.

- Total SNPs detected (column “Total SNPs in sample”): The total number of SNPs found in a given pool with a given set of parameters, that is, the positive control SNPs found plus all other SNPs. Most of these other SNPs are assumed to be false positives, since this number decreases markedly as the stringency of the SNP detection parameters increases. However, some novel SNPs could exist in this set.

- Sensitivity: Here, simply the percentage of positive control SNPs found in a given pool with a given set of parameters. Sensitivity decreases as SNP detection stringency increases.

- False Discovery Rate: Defined as (total SNPs detected – positive control SNPs found) / total SNPs detected × 100.
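
The definitions above can be checked against the table with a few lines of Python. This is only an illustrative sketch and not code from the study: the function names are hypothetical, and the inputs are the worked example from the positive control footnote and the row for library 768_1L at a minimum read count of 5.

```python
def pooled_allele_frequency(genotypes, reference_allele):
    """Percentage of non-reference alleles across a list of diploid genotypes."""
    alleles = "".join(genotypes)
    non_ref = sum(1 for allele in alleles if allele != reference_allele)
    return 100.0 * non_ref / len(alleles)


def sensitivity(positive_found, positive_total):
    """Percentage of positive control SNPs that were detected."""
    return 100.0 * positive_found / positive_total


def false_discovery_rate(total_detected, positive_found):
    """Percentage of detected SNPs that are not positive control SNPs."""
    return 100.0 * (total_detected - positive_found) / total_detected


# Footnote example: "CG" in sample 759, reference "CC" in samples 760-764.
freq = pooled_allele_frequency(["CG", "CC", "CC", "CC", "CC", "CC"], "C")
print(f"{freq:.1f}%")  # 8.3%

# Library 768_1L, minimum read count 5 (values taken from the table).
print(f"{sensitivity(226, 244):.1f}%")  # 92.6%
print(f"{false_discovery_rate(376, 226):.1f}%")  # 39.9%
```

Applying the same two formulas to the remaining rows reproduces the sensitivity and false discovery rate columns of the table.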


ElSharawy et al. BMC Genomics 2012 13:500   doi:10.1186/1471-2164-13-500
