Open Access Research article

Comparative analysis of sequence features involved in the recognition of tandem splice sites

Ralf Bortfeldt1*, Stefanie Schindler2, Karol Szafranski2, Stefan Schuster1 and Dirk Holste3,4*

Author Affiliations

1 Department of Bioinformatics, Friedrich-Schiller University, Ernst-Abbe-Platz 2, D-07743 Jena, Germany

2 Fritz-Lipmann Institute for Aging Research, Beutenbergstraße 11, D-07745 Jena, Germany

3 Research Institute of Molecular Pathology, Dr. Bohr-Gasse 7, A-1030, Vienna, Austria

4 Institute of Molecular Biotechnology of the Austrian Academy of Sciences, Dr. Bohr-Gasse 3-5, A-1030, Vienna, Austria

BMC Genomics 2008, 9:202  doi:10.1186/1471-2164-9-202

Published: 30 April 2008

Abstract

Background

Pre-mRNA splicing is frequently variable and produces multiple alternatively spliced (AS) isoforms that encode different messages from one gene locus. Computational studies have uncovered a class of highly similar isoforms related to tandem 5'-splice sites (5'ss) and 3'-splice sites (3'ss), but experimental evidence for them has been sparse and anecdotal. To compare the types and levels of alternative tandem splice site exons occurring in different human organ systems and cell types, and to study known sequence features involved in the recognition and distinction of neighboring splice sites, we performed large-scale, stringent alignments of cDNA sequences and ESTs to the human and mouse genomes, followed by experimental validation.

Results

We analyzed alternative 5'ss exons (A5Es) and alternative 3'ss exons (A3Es), derived from transcript sequences that were aligned to assembled genome sequences, to infer patterns of AS occurring in several thousand genes. Comparing the levels of overlapping (tandem) and non-overlapping (competitive) A5Es and A3Es, we saw a clear preference of isoforms for tandem acceptors and donors, with exon extensions of four nucleotides and three to six nucleotides, respectively. A subset of inferred A5E tandem exons was selected and experimentally validated. Focusing on A5Es, we investigated their transcript coverage, sequence conservation and base-pairing to U1 snRNA, proximal and distal splice site classification, and candidate motifs for cis-regulatory activity, and compared A5Es with A3Es, constitutive exons and pseudo-exons in H. sapiens and M. musculus. The results reveal a small but authentic enriched set of preferred tandem splice sites, with specific distances between proximal and distal 5'ss (3'ss), which showed a marked dichotomy between the levels of in- and out-of-frame splicing for A5Es and A3Es, respectively. The results also identified a number of candidate NMD targets and allowed a rough estimate of the number of undetected tandem donors based on splice site information.
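
The in-frame versus out-of-frame distinction above follows from the length of the exon extension: within a coding region, an extension whose length is a multiple of three preserves the reading frame, while any other length shifts it. The sketch below is a minimal illustration of that arithmetic, not the authors' pipeline. The function name `classify_extension` and the parameter `max_tandem_nt` are hypothetical, and the cutoff of 6 nucleotides separating "tandem" from "competitive" sites is an assumption chosen only for illustration; the article body defines the actual criterion.

```python
def classify_extension(distance_nt, max_tandem_nt=6):
    """Classify a pair of alternative splice sites by the distance between them.

    distance_nt   -- nucleotides between the proximal and distal splice site
    max_tandem_nt -- hypothetical cutoff (assumption, not from the article)
                     below or at which the pair is labeled "tandem"
    """
    if distance_nt <= 0:
        raise ValueError("distance_nt must be a positive number of nucleotides")

    # Short distances are labeled tandem, longer ones competitive.
    site_class = "tandem" if distance_nt <= max_tandem_nt else "competitive"

    # An extension that is a multiple of three keeps the reading frame intact.
    frame = "in-frame" if distance_nt % 3 == 0 else "out-of-frame"

    return site_class, frame


if __name__ == "__main__":
    for d in (3, 4, 6, 25):
        print(d, classify_extension(d))
```

Under these assumptions, an extension of three or six nucleotides is reported as in-frame, and an extension of four nucleotides as out-of-frame, which is the kind of event that can introduce a premature stop codon and make a transcript a candidate NMD target.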

Conclusion

This comparative study distinguishes tandem 5'ss and 3'ss, with extensions three to six nucleotides long, as having unusually high proportions of AS, experimentally validates tandem donors in a panel of different human tissues, highlights the dichotomy in the types of AS occurring at tandem splice sites, and shows that human alternative exons spliced at overlapping 5'ss possess features of typical splice variants that could well be beneficial for the cell.