This article is part of the supplement: Selected articles from the First IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS 2011): Bioinformatics
Maximum expected accuracy structural neighbors of an RNA secondary structure
1 Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
2 Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud XI, 91405 Orsay cedex, France
3 Laboratoire d'Informatique (LIX), Ecole Polytechnique, 91128 Palaiseau, France
4 Department of Mathematics and Computer Science, Denison University, Granville, OH 43023-0810, USA
BMC Bioinformatics 2012, 13(Suppl 5):S6 doi:10.1186/1471-2105-13-S5-S6
Published: 12 April 2012
Since RNA molecules regulate genes and control alternative splicing by allostery, it is important to develop algorithms to predict RNA conformational switches. Some tools, such as paRNAss, RNAshapes and RNAbor, can be used to predict potential conformational switches; nevertheless, no existing tool can accurately detect general (i.e., not family-specific) entire riboswitches (both aptamer and expression platform). Thus, the development of additional algorithms to detect conformational switches seems important, especially since the difference in free energy between the two metastable secondary structures may be as large as 15-20 kcal/mol. It has recently emerged that RNA secondary structure can be more accurately predicted by computing the maximum expected accuracy (MEA) structure, rather than the minimum free energy (MFE) structure.
Given an arbitrary RNA secondary structure S₀ for an RNA nucleotide sequence a = a₁, ..., aₙ, we say that another secondary structure S of a is a k-neighbor of S₀ if the base pair distance between S₀ and S is k. In this paper, we prove that the Boltzmann probability of all k-neighbors of the minimum free energy structure S₀ can be approximated with accuracy ε and confidence 1 - p, simultaneously for all 0 ≤ k < K, by a relative frequency count over N sampled structures, provided that N is sufficiently large, with the bound expressed through Φ(z), the cumulative distribution function (CDF) of the standard normal distribution. We go on to describe the algorithm RNAborMEA, which for an arbitrary initial structure S₀ and for all values 0 ≤ k < K, computes the secondary structure MEA(k), having maximum expected accuracy over all k-neighbors of S₀. Computation time is O(n³ · K²), and memory requirements are O(n² · K). We analyze a sample TPP riboswitch, and apply our algorithm to the class of purine riboswitches.
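The k-neighbor definition and the relative frequency estimate described above can be illustrated with a minimal Python sketch. It is not the RNAborMEA implementation. Structures are assumed to be given in dot-bracket notation without pseudoknots, and the sampled structures are assumed to come from some Boltzmann sampler that is not shown here. The function names are hypothetical. The helper sample_size assumes a bound of the form N ≥ Φ⁻¹(p/(2K))² / (4ε²), which is what a normal approximation with a union bound over the K values of k would give; this form is an assumption of the sketch, not a quotation of the bound proved in the paper.

```python
from collections import Counter
from math import ceil
from statistics import NormalDist


def base_pairs(dot_bracket):
    """Return the set of base pairs (i, j), 0-indexed, of a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def bp_distance(s, t):
    """Base pair distance: number of pairs present in exactly one structure."""
    return len(base_pairs(s) ^ base_pairs(t))


def sample_size(eps, p, K):
    """Assumed bound: smallest integer N with N >= z^2 / (4 eps^2),
    where z = inverse normal CDF at p / (2K)."""
    z = NormalDist().inv_cdf(p / (2 * K))
    return ceil(z * z / (4 * eps * eps))


def neighbor_frequencies(s0, samples, K):
    """Relative frequency of k-neighbors of s0 among samples, for 0 <= k < K."""
    counts = Counter(bp_distance(s0, s) for s in samples)
    n = len(samples)
    return [counts[k] / n for k in range(K)]
```

With a reference structure s0 and a list of sampled structures, neighbor_frequencies(s0, samples, K) returns one estimated probability per distance k. Under the assumed bound, the required number of samples grows as 1/ε² but only slowly with K, since K enters through the normal quantile.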
The approximation of RNAbor by sampling, with a rigorous bound on accuracy, together with the computation of maximum expected accuracy k-neighbors by RNAborMEA, provides additional tools toward conformational switch detection. Results from RNAborMEA are quite distinct from those of other tools, such as RNAbor, RNAshapes and paRNAss, and hence may provide orthogonal information when looking for suboptimal structures or conformational switches. Source code for RNAborMEA can be downloaded from http://sourceforge.net/projects/rnabormea/ or http://bioinformatics.bc.edu/clotelab/RNAborMEA/.