Systematic investigation of insertional and deletional RNA-DNA differences in the human transcriptome
1 Biophysics Graduate Program, The Ohio State University, Columbus, OH, USA
2 Departments of Physics and Chemistry & Biochemistry, The Ohio State University, Columbus, OH, USA
3 Center for RNA Biology, The Ohio State University, Columbus, OH, USA
BMC Genomics 2012, 13:616 doi:10.1186/1471-2164-13-616Published: 13 November 2012
The genomic information which is transcribed into the primary RNA can be altered by RNA editing at the transcriptional or post-transcriptional level, which provides an effective way to create transcript diversity in an organism. Altering can occur through substitutional RNA editing or via the insertion or deletion of nucleotides relative to the original template. Taking advantage of recent high throughput sequencing technology combined with bioinformatics tools, several groups have recently studied the genome-wide substitutional RNA editing profiles in human. However, while insertional/deletional (indel) RNA editing is well known in several lower species, only very scarce evidence supports the existence of insertional editing events in higher organisms such as human, and no previous work has specifically focused on indel differences between RNA and their matching DNA in human. Here, we provide the first study to examine the possibility of genome-wide indel RNA-DNA differences in one human individual, NA12878, whose RNA and matching genome have been deeply sequenced.
We apply different computational tools that are capable of identifying indel differences between RNA reads and the matching reference genome and we initially find hundreds of such indel candidates. However, with careful further analysis and filtering, we conclude that all candidates are false-positives created by splice junctions, paralog sequences, diploid alleles, and known genomic indel variations.
Overall, our study suggests that indel RNA editing events are unlikely to exist broadly in the human transcriptome and emphasizes the necessity of a robust computational filter pipeline to obtain high confidence RNA-DNA difference results when analyzing high throughput sequencing data as suggested in the recent genome-wide RNA editing studies.