Open Access Research article

Bioinformatics prediction of overlapping frameshifted translation products in mammalian transcripts

Sebastien Ribrioux1*, Adrian Brüngger2, Birgit Baumgarten3, Klaus Seuwen3 and Markus R John3

Author Affiliations

1 Genedata AG, Maulbeerstrasse 46, CH-4016 Basel, Switzerland

2 Basilea Pharmaceutica AG, Grenzacherstrasse 487, CH-4005 Basel, Switzerland

3 Novartis Institutes for Biomedical Research, CH-4002 Basel, Switzerland

BMC Genomics 2008, 9:122  doi:10.1186/1471-2164-9-122

Published: 6 March 2008

Abstract

Background

Exceptionally, a single nucleotide sequence can be translated in vivo in two different frames to yield distinct proteins. In the case of the G-protein alpha subunit XL-alpha-s transcript, a frameshifted open reading frame (ORF) in exon 1 is translated to yield a structurally distinct protein called Alex, which plays a role in platelet aggregation and neurological processes. We carried out a novel bioinformatics screen for other possible dual-frame translated sequences, based on comparative genomics.

Results

Our method searched human, mouse and rat transcripts in frames +1 and -1 for ORFs that are unusually well conserved at the amino acid level. We name these conserved frameshifted overlapping ORFs 'matreshkas' to reflect their nested character. Selected findings include the following: the G-protein coupled receptor GPR27 is entirely contained within a frame -1 matreshka; thrombopoietin contains a matreshka that spans ~70% of its length; platelet glycoprotein IIIa (ITGB3) contains a matreshka with the predicted characteristics of a secreted peptide hormone; and the potassium channel KCNK12 contains a matreshka spanning >400 amino acids.
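
To make the idea of scanning frames +1 and -1 concrete, the following is a minimal sketch, not the pipeline used in this study. It only locates stop-free codon runs in the two shifted frames of a single transcript; the conservation scoring across human, mouse and rat is not shown. The function names, the `min_codons` parameter, and the convention that frame +1 starts one nucleotide downstream of the annotated CDS start (and frame -1 one nucleotide upstream, modulo 3) are assumptions made for illustration.

```python
# Illustrative sketch only: names and frame convention are assumptions,
# not taken from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_stretches(seq, offset, min_codons=1):
    """Return (start, end) nucleotide coordinates (end exclusive) of
    stop-free codon runs in the reading frame that begins at `offset`."""
    seq = seq.upper()
    runs = []
    run_start = offset
    pos = offset
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            # Close the current run just before the stop codon.
            if (pos - run_start) // 3 >= min_codons:
                runs.append((run_start, pos))
            run_start = pos + 3
        pos += 3
    # Close the final run at the last complete codon.
    if (pos - run_start) // 3 >= min_codons:
        runs.append((run_start, pos))
    return runs


def shifted_frame_orfs(seq, cds_start, min_codons=1):
    """Scan frames +1 and -1 relative to the annotated CDS start
    (assumed convention: +1 is shifted one nucleotide downstream)."""
    plus_offset = (cds_start + 1) % 3
    minus_offset = (cds_start - 1) % 3
    return {
        "+1": stop_free_stretches(seq, plus_offset, min_codons),
        "-1": stop_free_stretches(seq, minus_offset, min_codons),
    }


if __name__ == "__main__":
    # Toy transcript; frame 0 reads ATG AAA TGA CCC GGG TAA.
    print(shifted_frame_orfs("ATGAAATGACCCGGGTAA", cds_start=0))
```

In a screen of the kind described above, each candidate stretch would then be translated and compared with its orthologous stretches in the other species; that comparison step is deliberately left out of this sketch.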

Conclusion

Although the in vivo existence of translated matreshkas has not been experimentally verified, this genome-wide analysis provides strong evidence that substantial overlapping coding sequences exist in a number of human and rodent transcripts.