Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Highly Accessed Research article

Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences

Stéfan Engelen and Fariza Tahi*

Author Affiliations

Laboratoire IBISC – FRE CNRS 2873. CNRS, Université d'Evry Val-d'Essonne, Genopole. 523, place des Terrasses. 91000 Evry, France

For all author emails, please log on.

BMC Bioinformatics 2007, 8:464  doi:10.1186/1471-2105-8-464

Published: 28 November 2007

Abstract

Background

The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the pairings (Watson-Crick, Wobble and non-canonical pairings). This "comparative approach" consists of identifying mutations from homologous sequence alignments. The sequences must covary enough for compensatory mutations to be revealed, but comparison is difficult if they are too different. Thus the choice of homologous sequences is critical. While many possible combinations of homologous sequences may be used for prediction, only a few will give good structure predictions. This can be due to poor quality alignment in stems or to the variability of certain sequences. This problem of sequence selection is currently unsolved.

Results

This paper describes an algorithm, SSCA, which measures the suitability of sequences for the comparative approach. It is based on evolutionary models with structure constraints, particularly those on sequence variations and stem alignment. We propose three models, based on different constraints on sequence alignments. We show the results of the SSCA algorithm for predicting the secondary structure of several RNAs. SSCA enabled us to choose sets of homologous sequences that gave better predictions than arbitrarily chosen sets of homologous sequences.

Conclusion

SSCA is an algorithm for selecting combinations of RNA homologous sequences suitable for secondary structure predictions with the comparative approach.