Strand-specific RNA-Seq reveals widespread and developmentally regulated transcription of natural antisense transcripts in Plasmodium falciparum
- Equal contributors
1 Biology of Host-Parasite Interactions Unit, Institut Pasteur, Paris, France
2 CNRS URA2581, Paris, France
3 Cell Biology of Parasitism Unit, Institut Pasteur, Paris, France
4 INSERM U786, Paris, France
5 Plate-Forme Transcriptome et Epigénome, Département Génomes et Génétique, Institut Pasteur, Paris, France
6 Present address: Research Center for Infectious Diseases, University Wuerzburg, Josef Schneider-Str. 2/Bau D15, 97080 Wuerzburg, Germany
7 Present address: Institute of Infectious Diseases and Vaccine Development, Tongji University School of Medicine, 1239 Siping Road, Shanghai 200092, China
BMC Genomics 2014, 15:150 doi:10.1186/1471-2164-15-150
Published: 22 February 2014
Advances in high-throughput sequencing have led to the discovery of widespread transcription of natural antisense transcripts (NATs) in a large number of organisms, where these transcripts have been shown to play important roles in the regulation of gene expression. NATs have likewise been observed in Plasmodium, but our understanding of their genome-wide distribution remains incomplete because previous datasets had limited depth and uncertain strand specificity.
To gain insights into the genome-wide distribution of NATs in P. falciparum, we performed RNA-ligation based strand-specific RNA sequencing at unprecedented depth. Our data indicate that 78.3% of the genome is transcribed during blood-stage development. Moreover, our analysis reveals significant levels of antisense transcription from at least 24% of protein-coding genes. While expression levels of NATs change during the intraerythrocytic developmental cycle (IDC), they do not correlate with the corresponding mRNA levels. Interestingly, antisense transcription is not evenly distributed across coding sequences (CDSs) but is strongly clustered towards the 3′-end of CDSs. Furthermore, for a significant subset of NATs, transcript levels correlate with mRNA levels of neighboring genes.
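As an illustration of what comparing NAT and mRNA levels across the IDC involves, the following minimal sketch computes a Pearson correlation between two expression profiles sampled at the same time points. It is not the authors' pipeline: the function name `pearson`, the choice of Pearson rather than another correlation measure, and all expression values are assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation.
        return float("nan")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression levels at four IDC time points (made-up numbers).
sense_mrna = [10.0, 40.0, 80.0, 20.0]     # mRNA of the gene carrying the NAT
antisense = [5.0, 6.0, 4.0, 5.5]          # NAT on the opposite strand
neighbor_mrna = [12.0, 13.0, 10.0, 12.5]  # mRNA of an adjacent gene

r_same_gene = pearson(antisense, sense_mrna)
r_neighbor = pearson(antisense, neighbor_mrna)
```

With strand-specific data, the sense and antisense profiles come from reads assigned to opposite strands of the same locus, which is what makes the two inputs separable in the first place.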
Finally, we were able to identify the polyadenylation sites (PASs) for a subset of NATs, demonstrating that at least some NATs are polyadenylated. We also mapped the PASs of 3443 coding genes, yielding an average 3′ untranslated region length of 523 bp.
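A 3′ UTR length of this kind can be derived from a mapped PAS and the annotated end of the coding sequence, taking the strand into account. The sketch below shows the arithmetic under stated assumptions: coordinates are 1-based and inclusive, `cds_start` is smaller than `cds_end` regardless of strand, the PAS is given as the last transcribed base, and the function names and example coordinates are hypothetical rather than taken from the paper.

```python
def utr3_length(strand, cds_start, cds_end, pas):
    """Length of the 3' UTR in bp, from the end of the CDS to the PAS.

    Assumes 1-based inclusive coordinates with cds_start < cds_end,
    and that the annotated CDS includes the stop codon.
    """
    if strand == "+":
        # Transcript runs left to right; UTR follows cds_end.
        length = pas - cds_end
    elif strand == "-":
        # Transcript runs right to left; UTR lies before cds_start.
        length = cds_start - pas
    else:
        raise ValueError("strand must be '+' or '-'")
    if length < 0:
        raise ValueError("PAS lies upstream of the end of the CDS")
    return length


def mean_utr3_length(genes):
    """Average 3' UTR length over (strand, cds_start, cds_end, pas) tuples."""
    lengths = [utr3_length(*gene) for gene in genes]
    return sum(lengths) / len(lengths)


# Hypothetical genes on either strand (coordinates are made up).
example_genes = [
    ("+", 1000, 2000, 2523),
    ("-", 5000, 6500, 4477),
]
average = mean_utr3_length(example_genes)
```

Genes with several mapped PASs would need a rule for choosing one site (for example the most frequently used), which this sketch leaves out.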
Our strand-specific analysis of the P. falciparum transcriptome expands and strengthens the existing body of evidence that antisense transcription is a substantial phenomenon in P. falciparum. For a subset of neighboring genes we find that sense and antisense transcript levels are intricately linked, while other NATs appear to be regulated independently of mRNA transcription. Our deep strand-specific dataset will provide a valuable resource for the precise determination of expression levels, as it separates sense from antisense transcript levels, which we find often differ significantly. In addition, the extensive novel data on 3′ UTR length will allow others to search for regulatory motifs in the UTRs and help to understand post-transcriptional regulation in P. falciparum.