Sequence characteristics of false positive junctions. Sequence logos of the exon and intron sequence bordering splice junctions in (A) annotated GT-AG introns, (B) unannotated GT-AG introns that pass filtering, and (C) repeat-induced false positives. (D) Cartoon illustrating how false positives arise from repetitive sequence and sequencing error. A transcribed fragment from a region of repetitive sequence is incorporated into a library. A base-calling error (in red) produces a read with an “A” instead of a “T” at the indicated position. This incorrect base call induces an incorrect gapped alignment, which the aligner prefers because it minimizes sequence mismatches. (E) Illustration of each false positive error type (Table 2).
Sturgill et al. BMC Bioinformatics 2013 14:320 doi:10.1186/1471-2105-14-320
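The mechanism in panel D can be reproduced on a toy sequence. The sketch below is an illustration only, not the authors' pipeline or any real aligner: the reference, the two repeat copies, and the helper names (`mismatches`, `best_ungapped`, `best_gapped`) are all hypothetical, and the search is brute force with a mismatch count as the only score (real aligners also penalize gaps). It assumes two repeat copies that differ at two positions and a read from the first copy carrying a single “A”-for-“T” base-calling error.

```python
# Toy illustration of a repeat-induced false positive junction.
# All sequences and function names here are hypothetical.

def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def best_ungapped(ref, read):
    """Return (mismatches, start) of the best contiguous placement."""
    n = len(read)
    return min((mismatches(ref[i:i + n], read), i)
               for i in range(len(ref) - n + 1))

def best_gapped(ref, read, min_anchor=4, min_intron=10):
    """Return (mismatches, gap_length, start, split) of the best placement
    that splits the read once and skips at least min_intron reference bases."""
    n = len(read)
    best = None
    for k in range(min_anchor, n - min_anchor + 1):      # split point in the read
        for i in range(len(ref) - n - min_intron + 1):   # start of the left part
            left = mismatches(ref[i:i + k], read[:k])
            for j in range(i + k + min_intron, len(ref) - (n - k) + 1):
                total = left + mismatches(ref[j:j + n - k], read[k:])
                cand = (total, j - (i + k), i, k)
                if best is None or cand < best:
                    best = cand
    return best

# Two repeat copies that differ at positions 3 and 13 (0-based).
copy1 = "ACGTTGCAGTCCATGA"
copy2 = "ACGCTGCAGTCCAAGA"
ref = "TTTT" + copy1 + "GGCGGCGGCG" + copy2 + "TTTT"

# Read transcribed from copy1, with a base-calling error: "A" instead of "T".
read = copy1[:13] + "A" + copy1[14:]

ungapped = best_ungapped(ref, read)
gapped = best_gapped(ref, read)
print("ungapped (mismatches, start):", ungapped)
print("gapped (mismatches, gap, start, split):", gapped)
```

In this toy case the correct ungapped placement on the first copy carries one mismatch (the miscalled base), whereas splitting the read between the two copies gives zero mismatches and implies a 26-base gap that does not correspond to any real intron. An aligner that ranks placements by mismatches alone would therefore report the spurious junction.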