Open Access Research article

POEM, A 3-dimensional exon taxonomy and patterns in untranslated exons

Keith Knapp1, Ashley Chonka1 and Yi-Ping Phoebe Chen1,2*

Author Affiliations

1 Faculty of Science and Technology, Deakin University, Victoria, Australia

2 Australian Research Council (ARC) Centre of Excellence in Bioinformatics, Australia

BMC Genomics 2008, 9:428  doi:10.1186/1471-2164-9-428

Published: 20 September 2008

Abstract

Background

The existence of exons and introns has been known for thirty years. Despite this, there has been little formal research into the categorization of exons. Exon taxonomies used by researchers tend to be selected ad hoc or based on an information-poor de facto standard. Exons have been shown to have specific properties and functions based on, among other things, their location and order. These factors should play a role in exon naming, so that it is clear which exon type(s) are in question.

Results

POEM (Protein Oriented Exon Monikers) is a new taxonomy focused on protein-proximal exons. It integrates three dimensions of information (Global Position, Regional Position and Region), and its exon categories are thus based on known statistical exon features. POEM is applied to two congruent untranslated exon datasets, yielding the following statistical properties. Using the POEM taxonomy, previous wide-ranging estimates of initial 5' untranslated region exons are resolved: according to our datasets, 29–36% of genes have wholly untranslated first exons. Sequences containing untranslated exons are shown to consistently have up to 6 times more 5' untranslated exons than 3' untranslated exons. Finally, three exon patterns are determined which account for 70% of untranslated exon genes.
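To make the idea of a three-dimensional exon label concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes plus-strand exons given as half-open (start, end) intervals ordered 5' to 3', plus the coordinates of the coding sequence. The category strings ("first", "internal", "5UTR", and so on) and the names ExonLabel, label_exons and fraction_untranslated_first are hypothetical and chosen for illustration only; the abstract does not give the actual POEM category names.

```python
from dataclasses import dataclass
from typing import List, Tuple

Exon = Tuple[int, int]  # half-open interval [start, end), plus strand (assumption)


@dataclass(frozen=True)
class ExonLabel:
    """Hypothetical three-part label; not the official POEM category names."""
    global_position: str    # position of the exon within the whole transcript
    regional_position: str  # position of the exon among exons of the same region
    region: str             # "5UTR", "CDS", "3UTR", or a mix such as "5UTR+CDS"


def _position(index: int, count: int) -> str:
    """Name a position in an ordered group of `count` items."""
    if count == 1:
        return "single"
    if index == 0:
        return "first"
    if index == count - 1:
        return "last"
    return "internal"


def _region(start: int, end: int, cds_start: int, cds_end: int) -> str:
    """List every region the exon overlaps, in 5' to 3' order."""
    parts = []
    if start < cds_start:
        parts.append("5UTR")
    if start < cds_end and end > cds_start:
        parts.append("CDS")
    if end > cds_end:
        parts.append("3UTR")
    return "+".join(parts)


def label_exons(exons: List[Exon], cds_start: int, cds_end: int) -> List[ExonLabel]:
    """Assign one three-part label to each exon of a transcript."""
    regions = [_region(s, e, cds_start, cds_end) for s, e in exons]
    labels = []
    for i, region in enumerate(regions):
        # Indices of all exons that share this exon's region string.
        same = [j for j, r in enumerate(regions) if r == region]
        labels.append(ExonLabel(
            global_position=_position(i, len(exons)),
            regional_position=_position(same.index(i), len(same)),
            region=region,
        ))
    return labels


def fraction_untranslated_first(genes: List[Tuple[List[Exon], int, int]]) -> float:
    """Fraction of genes whose first exon lies wholly in the 5' UTR."""
    hits = sum(1 for exons, cs, ce in genes
               if label_exons(exons, cs, ce)[0].region == "5UTR")
    return hits / len(genes)


# Toy transcript (made-up coordinates): the CDS spans positions 250 to 450.
toy_exons = [(0, 100), (200, 300), (400, 500), (600, 700)]
toy_labels = label_exons(toy_exons, 250, 450)
```

In this toy transcript the first exon ends before the coding sequence begins, so it is labelled as wholly 5' untranslated; the second exon straddles the start of the coding sequence and receives a mixed region. A statistic like the 29–36% figure above would correspond to calling fraction_untranslated_first over a whole gene set, under the stated assumptions.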

Conclusion

We describe a thorough three-dimensional exon taxonomy called POEM, which is biologically and statistically relevant. No previous taxonomy provides such fine-grained information while still including all valid information dimensions. By providing a common taxonomy, POEM will improve the accuracy of genefinder comparisons and analysis. Its fine granularity will also facilitate unambiguous communication.