Research article (Open Access)

Improving the prediction of mRNA extremities in the parasitic protozoan Leishmania

Martin Smith1,2, Mathieu Blanchette2 and Barbara Papadopoulou1

1Research Centre in Infectious Diseases, CHUL Research Centre, 2705 Laurier Blvd., Quebec, QC G1V 4G2, Canada

2McGill Center for Bioinformatics, 3775 University St., Montreal, QC H3A 2B4, Canada


BMC Bioinformatics 2008, 9:158. doi:10.1186/1471-2105-9-158

Published: 20 March 2008

Abstract

Background

Leishmania and other members of the Trypanosomatidae family diverged early in eukaryotic evolution and consequently display unique cellular properties. Their apparent lack of transcriptional regulation is compensated for by complex post-transcriptional control mechanisms, including the processing of polycistronic transcripts by means of coupled trans-splicing and polyadenylation. Trans-splicing signals are often U-rich polypyrimidine (poly(Y)) tracts, which precede AG splice acceptor sites. However, in contrast to higher eukaryotes, there is no consensus polyadenylation signal in trypanosomatid mRNAs.
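
As an illustration of the signal described above, the following minimal Python sketch lists AG dinucleotides that are preceded by a pyrimidine-rich window. It is not the authors' method: the function names, the 20-nucleotide window and the 0.8 threshold are assumptions chosen only for the example.

```python
def pyrimidine_fraction(window: str) -> float:
    """Return the fraction of pyrimidines (C, T, or U) in a window."""
    if not window:
        return 0.0
    window = window.upper().replace("U", "T")
    return sum(base in "CT" for base in window) / len(window)


def candidate_acceptor_sites(seq: str, window: int = 20, min_fraction: float = 0.8):
    """Return (position, score) pairs for AG dinucleotides with a poly(Y)-like upstream window.

    position is the 0-based index of the A in the AG dinucleotide.
    window and min_fraction are illustrative values, not published parameters.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for i in range(window, len(seq) - 1):
        if seq[i:i + 2] == "AG":
            score = pyrimidine_fraction(seq[i - window:i])
            if score >= min_fraction:
                hits.append((i, score))
    return hits


if __name__ == "__main__":
    # Toy sequence: 5 G, a 20-nt CT repeat, then an AG acceptor.
    toy = "GGGGG" + "CT" * 10 + "AG" + "GGGG"
    print(candidate_acceptor_sites(toy))
```

A real predictor would need to rank competing AG sites rather than apply a fixed cutoff; the threshold here only keeps the example short.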

Results

We refined a previously reported method to target 5' splice junctions by incorporating the pyrimidine content of query sequences into a scoring function. We also investigated a novel approach for predicting polyadenylation (poly(A)) sites in silico, by comparing query sequences to polyadenylated expressed sequence tags (ESTs) using position-specific scanning matrices (PSSMs). An additional analysis of the distribution of putative splice junction to poly(A) distances helped to increase prediction rates by limiting the scanning range. These methods were able to simplify splice junction prediction without loss of precision and to increase the polyadenylation site prediction rate from 22% to 47% within 100 nucleotides.
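
The general idea of scanning a query sequence with a position-specific matrix over a restricted range can be sketched as follows. This is a generic log-odds matrix built from a few aligned example sequences, under assumed settings (uniform background, a pseudocount of 1, hypothetical function names); it does not reproduce the matrices, training data or parameters used in this work.

```python
import math

BASES = "ACGT"


def build_pssm(aligned, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix (one dict per column) from equal-length sequences.

    Assumes the sequences contain only A, C, G and T.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    pssm = []
    for col in range(length):
        counts = {b: pseudocount for b in BASES}
        for s in aligned:
            counts[s[col]] += 1
        total = sum(counts.values())
        pssm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pssm


def score_window(pssm, window):
    """Sum the per-column log-odds; unknown bases (e.g. N) contribute 0."""
    return sum(col.get(base, 0.0) for col, base in zip(pssm, window))


def best_site(pssm, seq, start, end):
    """Scan window offsets in [start, end) and return (offset, score), or None.

    Restricting start and end mirrors the idea of limiting the scanning range.
    """
    width = len(pssm)
    end = min(end, len(seq) - width + 1)
    best = None
    for offset in range(max(start, 0), end):
        s = score_window(pssm, seq[offset:offset + width])
        if best is None or s > best[1]:
            best = (offset, s)
    return best


if __name__ == "__main__":
    pssm = build_pssm(["TTTA", "TTTA", "TTAA"])  # toy training windows
    print(best_site(pssm, "GGGGTTTAGGGG", 0, 12))
```

In this sketch the scanning range is passed explicitly; in practice it could be derived from an expected distance distribution, such as the splice junction to poly(A) distances mentioned above.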

Conclusion

We propose a simplified trans-splicing prediction tool and a novel poly(A) prediction tool based on comparative sequence analysis. We discuss the impact of certain regions surrounding the poly(A) sites on prediction rates and consider the biological mechanisms that may correlate with them. This work aims to sharpen the identification of potentially functional untranslated regions (UTRs) in a large-scale, comparative genomics framework.

