Additional file 4.

Figure S3, Additional file 4. dbSNP concordance of e-genotype. One human sample of SOLiD data (26X coverage) from the 1000 Genomes pilot data was analyzed at the ~1 million SNPs identified in the project for that sample. The bars and the lines share the same color coding: good homozygous calls (blue), heterozygotes called as homozygotes (yellow), homozygotes called as the wrong homozygote (grey), good heterozygous calls (green), and erroneously called heterozygotes (red). The total miscall rate was ~2.6%. Excluding errors at very high and very low coverage (shown outside the dotted lines, and due to bad heterozygous SNP calls from repetitive regions or to coverage too low to detect heterozygotes, respectively), the miscall rate was 1.6%. Overall probe coverage was reduced to 8.2X by the extremely stringent requirement that probes match exactly over the full 31 bp probe length. The percentages of homozygous or heterozygous calls falling into each good or error category were graphed linearly against probe coverage, indicating that errors were much more likely at very high and very low probe coverage, while good calls were most likely at intermediate coverage.
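The five call categories and the two miscall rates described in the caption can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the genotype encoding (two-letter strings), the toy data, and the coverage cutoffs (3 and 20) are all assumptions made for illustration, since the caption does not state where the dotted lines fall.

```python
from collections import Counter

GOOD = {"good_hom", "good_het"}

def classify_call(true_gt, called_gt):
    """Assign one call to one of the five legend categories.

    Genotypes are two-letter strings such as "AA" or "AG" (an assumed
    encoding); allele order is ignored.
    """
    true_gt = "".join(sorted(true_gt))
    called_gt = "".join(sorted(called_gt))
    true_het = true_gt[0] != true_gt[1]
    called_het = called_gt[0] != called_gt[1]
    if called_het:
        # Heterozygous call: correct only if it matches the known genotype.
        return "good_het" if called_gt == true_gt else "bad_het"
    if true_het:
        # Known heterozygote reported as a homozygote.
        return "het_called_hom"
    return "good_hom" if called_gt == true_gt else "wrong_hom"

def miscall_rate(calls, min_cov=None, max_cov=None):
    """Return (miscall rate, category counts) for (true, called, coverage)
    tuples, optionally keeping only calls inside [min_cov, max_cov]."""
    counts = Counter()
    for true_gt, called_gt, cov in calls:
        if min_cov is not None and cov < min_cov:
            continue
        if max_cov is not None and cov > max_cov:
            continue
        counts[classify_call(true_gt, called_gt)] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0, counts
    errors = total - sum(counts[c] for c in GOOD)
    return errors / total, counts

# Toy data, invented for illustration only: (known, called, probe coverage).
toy_calls = [
    ("AA", "AA", 8),   # good homozygous call
    ("AG", "AG", 9),   # good heterozygous call
    ("AG", "AA", 1),   # heterozygote called as homozygote (low coverage)
    ("CC", "CT", 40),  # erroneous heterozygote (high coverage)
    ("GG", "TT", 7),   # homozygote called as the wrong homozygote
    ("CT", "TC", 10),  # good heterozygous call, allele order ignored
]

overall_rate, overall_counts = miscall_rate(toy_calls)
trimmed_rate, trimmed_counts = miscall_rate(toy_calls, min_cov=3, max_cov=20)
```

Restricting the coverage window removes the lowest- and highest-coverage calls before the rate is computed, which is the kind of exclusion the caption describes when it moves from the ~2.6% total to the 1.6% figure.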

Fawcett et al. BMC Genomics 2011 12:311   doi:10.1186/1471-2164-12-311