Open Access research article

Evolving stochastic context-free grammars for RNA secondary structure prediction

James WJ Anderson1*, Paula Tataru2, Joe Staines3, Jotun Hein1 and Rune Lyngsø1

Author Affiliations

1 Department of Statistics, University of Oxford, 1 South Parks Road, Oxford, UK

2 Bioinformatics Research Centre, Aarhus University, C.F. Møllers Allé 8, DK–8000 Aarhus C, Denmark

3 Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK

BMC Bioinformatics 2012, 13:78  doi:10.1186/1471-2105-13-78

Published: 4 May 2012

Abstract

Background

Stochastic context-free grammars (SCFGs) were applied successfully to RNA secondary structure prediction in the early 1990s, and were used in combination with comparative methods in the late 1990s. The set of SCFGs potentially useful for RNA secondary structure prediction is very large, but a few intuitively designed grammars have remained dominant. In this paper we investigate two automatic search techniques for effective grammars: exhaustive search for very compact grammars, and an evolutionary algorithm to find larger grammars. We also examine whether grammar ambiguity is as problematic for structure prediction as has been previously suggested.
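To make concrete what predicting a structure with an SCFG involves, the following is a minimal sketch and not the implementation used in the paper. It assumes a compact grammar of the Knudsen-Hein form (S → LS | L, L → s | dFd, F → dFd | LS), one of the classic hand-designed grammars. The rule and emission probabilities, and the names `predict_structure`, `P`, and `PAIRS`, are hypothetical and chosen only for illustration; in practice the probabilities would be estimated from training data. The code finds the most probable derivation (CYK-style) and reports it in dot-bracket notation.

```python
from functools import lru_cache

# Hypothetical rule probabilities (assumed for illustration, not trained).
P = {"S->LS": 0.7, "S->L": 0.3,
     "L->s": 0.8, "L->dFd": 0.2,
     "F->dFd": 0.6, "F->LS": 0.4}

# Assumed emissions: uniform over 4 bases and over 6 canonical/wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
EMIT_SINGLE = 0.25
EMIT_PAIR = 1.0 / 6.0


def predict_structure(seq):
    """Return (probability, dot-bracket) of the most probable derivation."""
    seq = seq.upper()
    impossible = (0.0, "")  # probability 0 marks "no derivation"

    def best(candidates):
        return max(candidates, key=lambda c: c[0], default=impossible)

    def paired(i, j, rule):
        # Rule X -> d F d applied to seq[i:j]: ends pair, F derives the inside.
        if j - i < 3 or (seq[i], seq[j - 1]) not in PAIRS:
            return impossible
        p, s = F(i + 1, j - 1)
        return (P[rule] * EMIT_PAIR * p, "(" + s + ")")

    def split(i, j, rule):
        # Rule X -> L S applied to seq[i:j], trying every split point k.
        out = []
        for k in range(i + 1, j):
            pl, sl = L(i, k)
            ps, ss = S(k, j)
            out.append((P[rule] * pl * ps, sl + ss))
        return out

    @lru_cache(maxsize=None)
    def S(i, j):
        if j <= i:
            return impossible
        p, s = L(i, j)
        return best([(P["S->L"] * p, s)] + split(i, j, "S->LS"))

    @lru_cache(maxsize=None)
    def L(i, j):
        if j - i == 1:  # L -> s emits one unpaired base
            return (P["L->s"] * EMIT_SINGLE, ".")
        return paired(i, j, "L->dFd")

    @lru_cache(maxsize=None)
    def F(i, j):
        if j <= i:
            return impossible
        return best([paired(i, j, "F->dFd")] + split(i, j, "F->LS"))

    return S(0, len(seq))


if __name__ == "__main__":
    prob, structure = predict_structure("GGGAAACCC")
    print(structure)  # (((...)))
```

Because this particular grammar is unambiguous, each structure has exactly one derivation, so the most probable derivation is also the most probable structure. For an ambiguous grammar the two can differ, which is why ambiguity has been regarded as a concern.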

Results

These search techniques were applied to predict RNA secondary structure on a maximal data set and revealed new and interesting grammars, though none are dramatically better than classic grammars. In general, results showed that many grammars with quite different structures could have very similar predictive ability. Many ambiguous grammars were found that were at least as effective as the best current unambiguous grammars.

Conclusions

Overall, the method of evolving SCFGs for RNA secondary structure prediction proved effective in finding many grammars with strong predictive accuracy, as good as or slightly better than those designed manually. Furthermore, several of the best grammars found were ambiguous, demonstrating that such grammars should not be disregarded.