Figure 6.

ROC-like curves summarizing the performance of the pseudo-validation and RNA-seq validation methods. For each method, across the range of thresholds for the tumor sample, we computed the true positive rate (y-axis) at each false discovery rate (x-axis). The evaluation was based on the 6,692 variant sites detected within 76 genes from 39 LUSC exome-seq pairs, of which 334 sites were validated as ‘somatic’ based on the deep-sequencing data. For the pseudo-validation method, across the range of thresholds for the tumor GATK quality score (marked with letters), a site was identified as pseudo-positive if the tumor score was larger than the threshold and the signed GATK quality score for the normal sample was less than -50. When ‘SbiasFilter’ is applied, a site becomes non-somatic if more than 95% or less than 5% of the variant alleles, but between 30% and 70% of the reference alleles, are on the forward strand. For the RNA-seq validation method, across the range of thresholds for the tumor RNA-seq variant allele frequency (VAF), a site was identified as positive if the tumor RNA-seq VAF was larger than the threshold and the normal exome-seq VAF was less than 2%. When ‘SbiasFilter’ is applied, a site becomes non-somatic if more than 95% or less than 5% of the variant-allele reads are on the forward strand in both the tumor exome-seq and the RNA-seq data.
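
The pseudo-validation rule and the computation of one point on the curve can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Site` record, its field names, and the helper functions are hypothetical, and the reference-allele condition of ‘SbiasFilter’ is read here as "between 30% and 70% on the forward strand".

```python
from dataclasses import dataclass


@dataclass
class Site:
    # All field names are hypothetical; they are not taken from the paper.
    tumor_score: float          # GATK quality score for the tumor sample
    normal_signed_score: float  # signed GATK quality score for the normal sample
    var_fwd_frac: float         # fraction of variant-allele reads on the forward strand
    ref_fwd_frac: float         # fraction of reference-allele reads on the forward strand
    validated: bool             # True if 'somatic' in the deep-sequencing data


def strand_biased(site: Site) -> bool:
    """SbiasFilter as read from the caption (assumed interpretation)."""
    var_extreme = site.var_fwd_frac > 0.95 or site.var_fwd_frac < 0.05
    ref_balanced = 0.30 < site.ref_fwd_frac < 0.70
    return var_extreme and ref_balanced


def is_pseudo_positive(site: Site, threshold: float, use_sbias_filter: bool = False) -> bool:
    """Tumor score above the threshold and signed normal score below -50."""
    if not (site.tumor_score > threshold and site.normal_signed_score < -50):
        return False
    if use_sbias_filter and strand_biased(site):
        return False
    return True


def tpr_fdr(sites, threshold, use_sbias_filter=False):
    """Return (true positive rate, false discovery rate) at one threshold."""
    called = [s for s in sites if is_pseudo_positive(s, threshold, use_sbias_filter)]
    true_pos = sum(s.validated for s in called)
    n_validated = sum(s.validated for s in sites)
    tpr = true_pos / n_validated if n_validated else 0.0
    fdr = (len(called) - true_pos) / len(called) if called else 0.0
    return tpr, fdr
```

Sweeping `threshold` over the observed tumor scores and plotting the resulting (FDR, TPR) pairs would give a curve of the kind shown in the figure. The RNA-seq method could be sketched the same way by replacing the two score conditions with the tumor RNA-seq VAF threshold and the normal exome-seq VAF cutoff of 2%.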

Kim and Speed BMC Bioinformatics 2013 14:189   doi:10.1186/1471-2105-14-189