A comparison of RNA folding measures
1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, Sweden.
2 Dept. of Evolutionary Biology, University of Copenhagen, Universitetsparken 15, 2100 Copenhagen Ø, Denmark.
3 School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK.
BMC Bioinformatics 2005, 6:241 doi:10.1186/1471-2105-6-241
Published: 3 October 2005
In the last few decades there has been a great deal of discussion concerning whether or not noncoding RNA sequences (ncRNAs) fold in a more well-defined manner than random sequences. In this paper, we investigate several existing measures of how well an RNA sequence folds, and compare the behaviour of these measures over a large range of Rfam ncRNA families. Such measures can be useful in, for example, identifying novel ncRNAs and indicating the presence of alternative RNA folds.
Our analysis shows that ncRNAs, but not mRNAs, in general have lower minimal free energy (MFE) than random sequences with the same dinucleotide frequency. Moreover, even when the MFE is significant, many ncRNAs appear not to have a unique fold, but rather several alternative folds, at least when folded in silico. Furthermore, we find that the six investigated measures are correlated to varying degrees.
Because of the correlations between the different measures, we find that it is sufficient to use only two of them in RNA folding studies: one to test whether the sequence in question has lower energy than a random sequence with the same dinucleotide frequency (the Z-score), and the other to test whether the sequence has a unique fold (the average base-pair distance, D).
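As an illustration of the two recommended measures, the following is a minimal Python sketch, not the implementation used in this study. It assumes that MFE values for the native sequence and for dinucleotide-preserving shuffles have already been obtained from a folding program, and that a sample of structures with their ensemble probabilities is available in dot-bracket notation. The function names are hypothetical. D is computed here as the probability-weighted base-pair distance over all ordered pairs of structures, without any length normalisation; the exact normalisation used elsewhere may differ.

```python
from itertools import product
from statistics import mean, stdev


def z_score(native_mfe, shuffled_mfes):
    """Number of standard deviations by which the native MFE lies
    below (negative) or above (positive) the mean MFE of the shuffles."""
    sigma = stdev(shuffled_mfes)  # sample standard deviation
    if sigma == 0:
        raise ValueError("shuffled MFEs have zero variance")
    return (native_mfe - mean(shuffled_mfes)) / sigma


def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs in a dot-bracket string
    (no pseudoknots; only '(' , ')' and unpaired symbols)."""
    stack, pairs = [], set()
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            pairs.add((stack.pop(), i))
    return pairs


def average_base_pair_distance(ensemble):
    """Assumed definition: sum over ordered structure pairs of
    P(a) * P(b) * |pairs(a) symmetric-difference pairs(b)|.
    `ensemble` is a list of (dot_bracket, probability) tuples."""
    parsed = [(base_pairs(s), p) for s, p in ensemble]
    return sum(
        pa * pb * len(a ^ b)
        for (a, pa), (b, pb) in product(parsed, repeat=2)
    )
```

With these definitions, a strongly negative Z-score indicates an MFE lower than expected for the dinucleotide composition, while a value of D close to zero indicates that the ensemble is dominated by a single fold.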