SNP discovery strategy with a reference sequence and short NGS reads of a heterozygous diploid genome. Variant frequency (VF) is the number of mapped NGS reads in a stack whose nucleotide differs from the nucleotide in the reference sequence, divided by the total number of mapped reads in the stack. VF takes values in [0, 1]. Folded variant frequency (FVF) equals 1 − VF if VF > 0.5 and VF if VF ≤ 0.5. FVF takes values in [0.0, 0.5]. SNP1 and SNP2 have different VF values but the same FVF. Cutoff values for VF or FVF and for read mapping depth must be optimized to reduce the false-positive SNP rate resulting from sequencing and mapping errors. SNP3 is inferred to be a true SNP, and the nucleotide in the reference sequence at that position is inferred to be a sequencing error.
You et al. BMC Genomics 2012 13:354 doi:10.1186/1471-2164-13-354
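
The VF and FVF definitions in the caption can be illustrated with a minimal Python sketch. The function names, the `passes_cutoffs` helper, and the read counts are hypothetical and chosen only for illustration; they are not taken from the figure or from the authors' pipeline, and no cutoff values are implied.

```python
def variant_frequency(non_ref_reads: int, total_reads: int) -> float:
    """VF: reads in the stack that differ from the reference base,
    divided by all mapped reads in the stack."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= non_ref_reads <= total_reads:
        raise ValueError("non_ref_reads must lie between 0 and total_reads")
    return non_ref_reads / total_reads


def folded_variant_frequency(vf: float) -> float:
    """FVF: 1 - VF if VF > 0.5, otherwise VF. Result lies in [0.0, 0.5]."""
    if not 0.0 <= vf <= 1.0:
        raise ValueError("vf must lie in [0, 1]")
    return 1.0 - vf if vf > 0.5 else vf


def passes_cutoffs(depth: int, fvf: float, min_depth: int, min_fvf: float) -> bool:
    """Hypothetical filter: keep a site only if both the mapping depth and
    the FVF reach caller-supplied cutoffs (which must be tuned per data set)."""
    return depth >= min_depth and fvf >= min_fvf


# Two illustrative sites with different VF but the same FVF,
# analogous to SNP1 and SNP2 in the caption.
vf_a = variant_frequency(3, 12)   # 0.25
vf_b = variant_frequency(9, 12)   # 0.75
fvf_a = folded_variant_frequency(vf_a)
fvf_b = folded_variant_frequency(vf_b)
print(vf_a, vf_b, fvf_a, fvf_b)
```

Folding maps VF and 1 − VF to the same value, so a single FVF cutoff treats a site the same way regardless of which allele happens to be in the reference.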