Identification of candidate regulatory sequences in mammalian 3' UTRs by statistical analysis of oligonucleotide distributions
1 Dept. of Theoretical Physics, University of Turin and INFN, Turin, Italy
2 Molecular Biotechnology Center and Dept. of Genetics, Biology and Biochemistry, University of Turin, Italy
BMC Bioinformatics 2007, 8:174 doi:10.1186/1471-2105-8-174
Published: 24 May 2007
3' untranslated regions (3' UTRs) contain binding sites for many regulatory elements, and in particular for microRNAs (miRNAs). The importance of miRNA-mediated post-transcriptional regulation has become increasingly clear in the last few years.
We propose two complementary approaches to the statistical analysis of oligonucleotide frequencies in mammalian 3' UTRs aimed at the identification of candidate binding sites for regulatory elements. The first method is based on the identification of sets of genes characterized by evolutionarily conserved overrepresentation of an oligonucleotide. The second method is based on the identification of oligonucleotides showing statistically significant strand asymmetry in their distribution in 3' UTRs.
Both methods identify many previously known binding sites located in 3' UTRs, in particular the seed regions of known miRNAs. Many new candidates are proposed for experimental verification.
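To make the strand-asymmetry idea concrete, the following minimal Python sketch counts an oligonucleotide and its reverse complement in a set of 3' UTR sequences and asks whether the two counts differ more than expected by chance. It is an illustration only: the abstract does not specify the statistical test, so the exact two-sided binomial test with equal strand probability used here is an assumption, as are the function names, the overlapping-match counting, and the requirement that sequences are upper-case DNA (A, C, G, T).

```python
from math import comb

# Assumption: sequences use the upper-case DNA alphabet (T, not U).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo):
    """Return the reverse complement of a DNA oligonucleotide."""
    return oligo.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq, oligo):
    """Count occurrences of oligo in seq, allowing overlapping matches."""
    count = 0
    start = seq.find(oligo)
    while start != -1:
        count += 1
        start = seq.find(oligo, start + 1)
    return count


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n fair (p = 0.5) trials."""
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    # The distribution is symmetric for p = 0.5, so double the smaller tail.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2**n
    return min(1.0, p)


def strand_asymmetry(utrs, oligo):
    """Return (forward count, reverse-complement count, p-value).

    Hypothetical null model: the oligo and its reverse complement are
    equally likely to occur. Palindromic oligos give equal counts and p = 1.
    """
    rc = reverse_complement(oligo)
    fwd = sum(count_overlapping(s, oligo) for s in utrs)
    rev = sum(count_overlapping(s, rc) for s in utrs)
    return fwd, rev, binomial_two_sided_p(fwd, fwd + rev)
```

In practice one would apply `strand_asymmetry` to every oligonucleotide of a fixed length and correct the resulting p-values for multiple testing; that step is omitted here.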