Table 5

COAD/READ SAMQA Results

Sample Group | Anomaly | Files affected

236 COAD/READ exon capture sequence files | "CIGAR should have zero elements for unmapped read" | 150 files
236 COAD/READ exon capture sequence files | Files were completely unpaired in sequencing | 5 files
236 COAD/READ exon capture sequence files | Files contained unpaired reads | 1 file

48 COAD/READ whole genome files | "CIGAR should have zero elements for unmapped read" | 48 files
48 COAD/READ whole genome files | "MAPQ should be 0 for unmapped read" | 48 files
48 COAD/READ whole genome files | "RG ID on SAMRecord not found in header" | 18 files


The technical tests were run on 236 exon capture sequence files and 48 whole genome sequence files for COAD/READ cancer samples from the TCGA project. The results identified problems with the files, including 6 files that could not be used for further analysis. The mapping issues found in the whole genome and exon capture datasets are due to a documented issue in the alignment tools, where BWA maps reads beyond the end of the reference. The tool flags this as an error, but it is non-fatal to SAMQA. The files and SAMQA output are provided in Additional file 1.
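To make the anomalies in Table 5 concrete, the following is a minimal sketch of how such checks could be expressed against plain SAM text. It is not SAMQA's implementation. The function names (`read_group_ids`, `check_sam`) and the counter keys are hypothetical, and the sketch assumes uncompressed, tab-separated SAM lines rather than BAM. The FLAG bits, the `*` placeholder for an absent CIGAR, and the `RG:Z:` optional tag follow the SAM format specification.

```python
from collections import Counter

PAIRED = 0x1    # SAM FLAG bit: read is part of a pair
UNMAPPED = 0x4  # SAM FLAG bit: read is unmapped


def read_group_ids(header_lines):
    """Collect the ID values declared on @RG header lines."""
    ids = set()
    for line in header_lines:
        if line.startswith("@RG"):
            for field in line.split("\t")[1:]:
                if field.startswith("ID:"):
                    ids.add(field[3:])
    return ids


def check_sam(lines):
    """Count Table 5 style anomalies in an iterable of SAM text lines."""
    lines = [ln.rstrip("\r\n") for ln in lines if ln.strip()]
    header = [ln for ln in lines if ln.startswith("@")]
    records = [ln for ln in lines if not ln.startswith("@")]
    known_rg = read_group_ids(header)

    counts = Counter(records=len(records))
    for rec in records:
        fields = rec.split("\t")
        flag, mapq, cigar = int(fields[1]), int(fields[4]), fields[5]
        if flag & UNMAPPED:
            # An unmapped read should carry no alignment details.
            if cigar != "*":
                counts["cigar_on_unmapped"] += 1
            if mapq != 0:
                counts["mapq_on_unmapped"] += 1
        if not flag & PAIRED:
            counts["unpaired_reads"] += 1
        # Optional tags start at the 12th column; RG names the read group.
        for tag in fields[11:]:
            if tag.startswith("RG:Z:") and tag[5:] not in known_rg:
                counts["rg_not_in_header"] += 1
    return counts
```

Under this sketch, a file in which `unpaired_reads` equals `records` would correspond to the "completely unpaired" row, and a file with a smaller nonzero count to the "contained unpaired reads" row. That mapping is an assumption made for illustration.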

Robinson et al. BMC Genomics 2011 12:419   doi:10.1186/1471-2164-12-419
