Open Access Methodology article

Identification of 3' gene ends using transcriptional and genomic conservation across vertebrates

Marcos Morgan1,2*, Alessandra Iaconcig1 and Andrés Fernando Muro1*

Author Affiliations

1 International Centre for Genetic Engineering and Biotechnology (ICGEB), Padriciano 99, I-34149, Trieste, Italy

2 Present address: EMBL Monterotondo Adriano Buzzati-Traverso Campus, Via Ramarini 32, 00015, Monterotondo, Italy

BMC Genomics 2012, 13:708  doi:10.1186/1471-2164-13-708

Published: 18 December 2012

Abstract

Background

In higher eukaryotes, gene expression is regulated at several levels. In particular, 3' UTRs play a central role in the translation, stability and subcellular localization of transcripts. In recent years, the development of high-throughput sequencing techniques has facilitated the acquisition of transcriptional data at a genome-wide level. However, annotation of the 3' ends of genes is still incomplete, which limits the interpretation of the data generated. For example, we have previously reported two genes, ADD2 and CPEB3, with conserved alternative 3' UTR isoforms that are not annotated in the current versions of the Ensembl and RefSeq human databases.

Results

To evaluate whether other conserved 3' ends are missing from these databases, we used comparative genomics and transcriptomics across several vertebrate species. In general, we observed that 3' UTR conservation is lost after the end of the mature transcript. Using this change in conservation before and after the 3' end of mature transcripts, we showed that many conserved ends are still not annotated. In addition, we used orthologous transcripts to predict 3' UTR extensions and validated these predictions using total RNA sequencing data. Finally, we used this method to identify unannotated 3' ends in rats and dogs. As a result, we report several hundred novel 3' UTR extensions in rats and a few thousand in dogs.
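The idea of a change in conservation before and after a 3' end can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 50-base window and the toy score track are all assumptions made for illustration, and the per-base scores merely stand in for a conservation track such as PhyloP.

```python
from statistics import mean


def conservation_drop(scores, end_index, window=50):
    """Mean conservation upstream minus mean conservation downstream.

    scores    : per-base conservation scores in transcript orientation
                (hypothetical stand-in for a PhyloP-like track)
    end_index : index of the last base of the candidate mature transcript
    window    : bases compared on each side (assumed value, not from the paper)
    """
    upstream = scores[max(0, end_index - window + 1): end_index + 1]
    downstream = scores[end_index + 1: end_index + 1 + window]
    if not upstream or not downstream:
        raise ValueError("window falls outside the scored region")
    return mean(upstream) - mean(downstream)


def best_candidate_end(scores, candidates, window=50):
    """Pick the candidate end with the largest upstream-to-downstream drop."""
    return max(candidates, key=lambda i: conservation_drop(scores, i, window))


# Toy track: 100 conserved bases followed by 100 unconserved bases.
toy_scores = [2.0] * 100 + [0.0] * 100
toy_candidates = [49, 99, 149]
toy_best = best_candidate_end(toy_scores, toy_candidates)
```

In this toy track the candidate at index 99 sits exactly where conservation is lost, so it yields the largest drop. A real analysis would also need to handle strand, missing scores and alignment gaps, none of which are modelled here.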

Conclusions

The methods presented here facilitate the identification of not-yet-annotated conserved 3' UTR extensions. Applying these methods will increase confidence in orthologous gene models across vertebrates.

Keywords:
pre-mRNA cleavage site; PhyloP; RNA sequencing; TransMap