Abstract
Background
Accurate and efficient RNA secondary structure prediction remains an important open problem in computational molecular biology. Historically, advances in computing technology have enabled faster and more accurate RNA secondary structure predictions. Previous parallelized prediction programs achieved significant improvements in runtime, but their implementations were not portable from niche highperformance computers or easily accessible to most RNA researchers. With the increasing prevalence of multicore desktop machines, a new parallel prediction program is needed to take full advantage of today’s computing technology.
Findings
We present here the first implementation of RNA secondary structure prediction by thermodynamic optimization for modern multicore computers. We show that GTfold predicts secondary structure in less time than UNAfold and RNAfold, without sacrificing accuracy, on machines with four or more cores.
Conclusions
GTfold supports advances in RNA structural biology by reducing the timescales for secondary structure prediction. The difference will be particularly valuable to researchers working with lengthy RNA sequences, such as RNA viral genomes.
Findings
Prediction algorithms based on thermodynamic models have been parallelized in the past, but for specific computer architectures that were available over two decades ago. The Vienna RNA package was originally parallelized [1] for a target architecture (an IBM RS/6000) popular in the early 1990’s. Subsequently [2,3], this work was extended to a firstgeneration clusterlike parallel computer (the Intel Delta) with an early messagepassing framework, which allows the processors to exchange only coarsegrained information. Similarly, other researchers [4] parallelized a minimal free energy (MFE) folding algorithm for two architectures (a Cray YMP vector machine and MasPar MP2 SIMD mesh) popular through the 1990’s.
Since then, parallel computing technology has changed substantially, and these previous parallel implementations are not compatible with current architectures. The current execution model includes multiple sockets of multicore processors, deep memory hierarchies with several levels of cache, and the availability of gigabytes of memory. New parallel implementations are needed to take full advantage of these new platforms.
We present the first implementation of RNA secondary structure prediction by thermodynamic optimization for modern multicore computers. Our motivation is to reduce the prediction time for long RNA sequences on modern parallel platforms, including contemporary laptops and desktops, because sequential algorithms still take many minutes to produce a single MFE structure. With the increasing prevalence of multicore desktop machines, reduction in runtime will be particularly valuable to researchers working with long RNA sequences, such as RNA viral genomes.
We compare our program GTfold with two of the most widelyused sequential programs, UNAfold [5] and RNAfold [1,6] which also run on desktop machines. The purpose of our comparisons is to demonstrate to RNA researchers that GTfold’s runtime improvements are easily accessible from their desktop computer. Our program GTfold achieves the same minimum free energy (MFE) accuracy as UNAfold and RNAfold in appreciably less time on modern multicore computers. With only four cores, GTfold folds an HIV genome, a sequence of nearly 10,000 nucleotides, in just 3.38 minutes versus the sequential running times of RNAfold (6.05 min.) and UNAfold (37.32 min.), a significant reduction in running time which further drops to below two minutes on eight or more cores.
In addition to our findings on runtime and accuracy given below, we summarize the algorithmic and implementation differences between these three software packages. These findings explain why, though they are based on the same basic model, they often produce different MFE structures. This information clarifies why the “same” programs can produce different outcomes.
Implementation
Like UNAfold and RNAfold, we implement the classic dynamic programming algorithm for free energy minimization [7] based on the same general nearest neighbor thermodynamic model (NNTM) [8,9]. The optimal solution for a given sequence, that is the MFE and associated secondary structure, is computed from optimal solutions for smaller subproblems. The algorithm first fills various one and twodimensional tables, whose entries correspond to the MFE scores for different subsequences. It then computes an optimal structure for the full sequence by performing a traceback through these tables. For detailed descriptions of the data dependencies among various elements of the tables and the algorithm in general, see [7,10].
In general, these dependencies leave open various options regarding the order in which the cells of the tables are calculated. As we will explain, the order used by GTfold is ideal for enabling a finegrained parallelism well suited for multicore and other shared memory parallel machines. However, when run sequentially on one core, this approach results in a larger number of memory accesses compared with the best sequential approach. Hence, it allows better concurrency on multicore machines at the expense of singlecore performance.
More specifically, let _{s1}_{s2}…_{sn}be an RNA sequence of length n. The MFE score of the sequence is computed by filling a 1×n array whose values depend on multiple n×n “helper” arrays which are themselves interdependent. The (i,j) entries of these helper arrays, where i<j, holds the score of the minimum free energy structure for subsets of possible structures on the subsequence _{si}_{si + 1}…_{sj}.
To describe the dependencies relevant to parallelization that govern the elements of these tables, it suffices to consider a single n×nmatrix and assume, for the moment, that its entries hold the optimal score for the set of possible structures on the subsequence _{si}_{si + 1}…_{sj}. The standard assumption of nested base pairings (i.e. lack of pseudoknots) means that the optimal structure for _{si}_{si + 1}…_{sj} depends only on subsequences of the form , where . Hence, as illustrated in Figure 1, the point (i,j) is dependent only on points that are both below and to the left.
Consider computing the optimal score for subsequences of length k=j−i + 1 for some 1≤k≤n. Since these particular computations are independent of each other, they can be performed in parallel. As implemented in GTfold, the optimal structures for subsequences of length k are computed in parallel from k=1, the main diagonal, to k=n, the optimal score for the full sequence.
Other orderings, such as columnwise or rowwise, also cover the whole computation space without violating the algorithm’s dependencies. For example, RNAfold computes the dynamic programming tables in rowmajor order, which is cachefriendly (for programs written in C/C++, like RNAfold) and has small average memory access time, but this approach is inherently sequential.
We implement the shared memory parallelism necessary for our approach using OpenMP [11] (see Availability and requirements for details). Using OpenMP allows all threads to access and to update values in a single set of arrays.
As described above, our parallel implementation exploits a diagonal access of table entries, allowing for concurrent computation via finegrained parallelism but incurring more cache misses (and hence, a larger average memory access time). As is often necessary when engineering parallel algorithms [12], we accept slower single processor performance, so that our code will run faster on the multiprocessor machines that are becoming ubiquitous. Further implementation details can be found in Mathuriya et. al 2009 [13].
Running time
The results shown here and in the rest of this section are based on GTfold (version 2.0), RNAfold distributed with the Vienna package (version 1.8.5), and UNAfold (version 3.8). When discussing running time, the number of cores is a parameter for a parallel algorithm so we refer to GTfold on n cores as GTfold(n). Running time comparisons are based on a complete HIV1 genome [GenBank:K03454] of length 9,719 nucleotides. Further details on materials and methods, including command line arguments and computing architectures, are provided in Additional file 1: Appendix 1. (Recall that previous parallelized prediction programs do not run on modern multicore machines. Because our primary interest was to evaluate the performance of GTfold on these current architectures, we did not include these previous parallelized programs in our study.)
Our running time results for UNAfold and RNAfold versus GTfold(n) on n=1,2,4,8,16 cores are given in Figure 2. Each program predicted an MFE structure (and only that) for the HIV1 genome, and MFE calculations also on a set of 22 complete HIV genomes of similar length (accession numbers given in Additional file 1: Appendix 1). We also document how the runtime scales with respect to length up to that point by graphing prediction times for different length subsequences of the HIV1 genome.
Additional file 1. Appendix 1Materials and Methods. Appendix 1 gives additional details on the sequences, software, commands, and machines used in the accuracy and running time analyses.
Format: PDF Size: 26KB Download file
This file can be viewed with: Adobe Acrobat Reader
We see that GTfold(n) is always faster than UNAfold, and significantly so for the full genome, which UNAfold computes in 37 minutes versus 3.38 minutes for GTfold(4). The relative running time of GTfold versus RNAfold depends on the number of cores used by GTfold. RNAfold computes an MFE structure for the full genome in 6.06 minutes while the running times for GTfold(n) are 12.44, 6.64, 3.38, 1.89, and 1.54 minutes respectively for n=1,2,4,8,16 cores. Hence, GTfold(1) is slower than RNAfold while the running time of GTfold(2) and RNAfold are similar (to within 10%), and GTfold on four or more cores runs in strictly less time than RNAfold.
The reproducibility of these times was confirmed by comparing UNAfold, RNAfold, and GTfold(n) (for n=1,2,4,8,16), across the full set of 22 HIV genomes (data not shown).
Accuracy
We also demonstrate that GTfold’s reduced running time does not come at the expense of prediction accuracy, relative to UNAfold and RNAfold, as measured against ribosomal structures obtained by comparative sequence analysis [14]. We find no meaningful differences in the average prediction accuracy, which is only around 40%, of all three programs across hundreds of ribosomal sequences. This confirms that in general GTfold predicts optimal structures as accurately as UNAfold and RNAfold, while simultaneously illustrating the importance of ongoing efforts to improve RNA secondary structure prediction.
Since GTfold(n) always returns the same structure, we do not specify the number of cores when discussing accuracy. The reference secondary structures for 223 16S and 55 23S ribosomal sequences were downloaded from the Comparative RNA Web site [15] (database queries given in Additional file 1: Appendix 1).
Our prediction accuracy results for UNAfold and RNAfold versus GTfold are given in Figure 3. We assess prediction accuracy against the structures for ribosomal sequences from the Comparative RNA Web site [15] using the two measures of sensitivity and selectivity as in [16]. These values are calculated based on counting base pairs in the two structures in one of three categories. True positives (TP) occur in both the comparative and predicted structure, false negatives (FN) occur only in the comparative structure, while false positives (FP) occur only in the predicted structure. Gardner & Giegerich further subdivide the FP category to account for compatible base pairs, denoted ξ, that could exist in the comparative structure and hence “can be considered neutral with respect to algorithm accuracy.” Thus,
Figure 3. Sensitivity vs. selectivity. Figure 3 plots selectivity against sensitivity for GTfold, RNAfold, and UNAfold on 223 16S sequences and for 55 23S sequences. The gray circles are (selectivity, sensitivity) pairs for an individual sequence, while the black dot shows the (average selectivity, average sensitivity) for a given method on a given class of sequences.
We find that all three programs produce structures with a wide range of sensitivity and selectivity values, with nearly indistinguishable distributions of accuracy. Each program has an average of only ∼40% for the 223 16S ribosomal sequences and the 55 23S ones. Given the high variance within programs (from a low of ∼10% to a high of ∼80%) versus the low variance between programs (less than 2% average difference), we find no meaningful difference in the accuracy of MFE predictions for GTfold versus UNAfold or RNAfold. (Further details are given in Additional file 2: Appendix 2.) This is certainly not to say that the three programs always predict the same optimal structure, in fact our experience indicates that the opposite is often true (particularly for UNAfold). As discussed below, this is a result of the subtle but crucial differences in the optimization criteria used by the different programs.
Additional file 2. Appendix 2Accuracy Comparison. Appendix 2 gives a more detailed analysis of the relative accuracy of the three prediction methods.
Format: PDF Size: 14KB Download file
This file can be viewed with: Adobe Acrobat Reader
Differences between programs
Finally, we address the fact that these three programs, which all implement the same basic algorithm using the same general NNTM, rarely return the same MFE structure — and in some cases return very different structures. This is due to the small but crucial differences in the thermodynamic and algorithmic details between GTfold, RNAfold, and UNAfold. The lack of robustness, or “illconditioning,” is a wellknown aspect [17] of this particular optimization problem; slight changes in the optimization criteria can result in drastically different MFE structures.
Our analysis of GTfold’s accuracy versus RNAfold or UNAfold necessitated a detailed comparison of the different optimization criteria of the three programs. Hence, here we summarize the details that currently result in different optimal structures while noting that these differences do not have a significant effect on prediction accuracy between programs. This information should be useful to other researchers who also find the different outcomes for these “same” programs perplexing.
The optimality criteria of GTfold and RNAfold differ simply in their thermodynamic parameter values, specifically for some special cases of internal, bulge and hairpin loops and in the number of significant digits used in the entropic penalty assigned to different lengths of internal, bulge, and hairpin loops. To address this difference, GTfold includes a rnafold option, which uses the RNAfold parameters. We have confirmed that under this option GTfold predicts the same MFE structures as RNAfold on all of our 16S and 23S sequences and on a set of 500 randomly generated sequences of length 1500.
The optimality criteria of GTfold and UNAfold differ in a number of ways. Again, there are differences in the thermodynamic parameters, but there are now also algorithmic differences, which effect the treatment of all types of loops as well as the stacked pairs. The most notable parameter differences are the treatment of GU pairs in stacks and small symmetric internal loops. The parameters used by UNAfold differ from the Nearest Neighbor Database (NNDB) [18] values in that various substructures involving GU pairs are assigned “infinite” energy. According to the UNAfold FAQ [19], these are instead treated as 1×1 and 2×2 internal loops, which are intended to be interpreted as base pairs. Nonetheless, it is our experience that this difference alters the optimization in small but significant ways. The algorithmic differences are fourfold:
1. UNAfold handles 2×3 internal loops as a special case.
2. UNAfold considers terminal mismatches in multiloops and external loops while GTfold instead considers pairs of 3’ and 5’ dangling ends.
3. UNAfold performs a prefiltering that prohibits from consideration base pairs without at least one possible neighboring pair.
4. UNAfold and GTfold can take different traceback paths producing different optimal structures with the same MFE.
To address the parameter differences and the first three algorithmic differences, GTfold has a unafold option. We have confirmed that under this option GTfold predicts a structure with the same MFE as UNAfold on all of our 16S and 23S sequences and on set of 500 randomly generated sequences of length 1500. We note that, due to differences in traceback, GTfold with this option does not always produce a structure identical to UNAfold despite obtaining the same minimum free energy.
User Options
In addition to basic MFE calculation, GTfold has a number of user options. One set of options controls what information is calculated and the level of detail in the output. As is now standard, in addition to the MFE value and optimal structure, GTfold can provide a loopbyloop energy decomposition, suboptimal structures [20] within a specified range of the MFE, and base pair probabilities via the partition function. Ongoing feature development includes performing stochastic backtracking to produce sets of structures sampled according to the Boltzmann distribution [21].
Additionally, GTfold provides several options for modifying the optimization criterion used. As with other programs, users can provide their own parameter sets for the NNTM, change the calculation of energies associated with the ends of helices in multiloops and external loops (treatment of dangling end energies and terminal mismatches), or constrain GTfold to consider only structures that contain (or that do not contain) particular base pairs or single stranded regions. It is also possible to provide experimental data from “SHAPE” experiments (Selective 2’Hydroxyl Acylation analyzed by Primer Extension [22]), to be used in the MFE optimization.
Finally, GTfold supports the unafold and rnafold options, under which it uses the same optimization criteria as UNAfold or as RNAfold, while still achieving the same running time advantages when multiple cores are available.
Availability and requirements
GTfold is freely available as open source at gtfold.sourceforge.net. Support for parallel threads requires OpenMP (which is part of current standard C compiler packages, such as GCC). We implemented shared memory parallelism using the “omp for” directive of the OpenMP[11] interface, version 3.0 (with GCC version 4.4). Further instructions, documentation, and details on requirements are specified on the GTfold sourceforge website.
• Project name: GTfold
• Project home page: http://gtfold.sourceforge.net/ webcite
• Operating systems: Unix, Linux, Mac OS X
• Programming language: C/C++
• Other requirements: C/C++ compiler; standard UNIX tools such as autoconf, automake, make; OpenMP support in the compiler
• License: GNU GPL v3
• Any restrictions to use by nonacademics: None
Availability of supporting data
Additional details on the sequences, software, commands, and machines used in the accuracy and running time analyses are available in Additional file 1: Appendix 1. Additional file 2: Appendix 2 provides a more detailed analysis of the relative accuracy of the three prediction methods.
Competing interests
The authors declare that they have no competing interests.
Authors contributions
Current version of GTfold code was written by DAB, JA, AA, PG, and ZS under the supervision of MSS, DAB, and CEH, and optimized by DAB. Numerical data was generated by PG and MSS. Paper was written by MSS and CEH, with contributions from DAB and ZS. All authors read and approved the final manuscript.
Acknowledgements
The authors thank previous students (in chronological order) for their significant contributions to the GTfold program development: Gregory Nou, Sonny Hernandez, Amrita Mathuriya, Sainath Mallidi, Neha Jatav, and Manoj Soni. MSS, PG, ZS, DAB, SCH, and CEH were supported by National Institutes of Health grant NIH NIGMS R01 GM083621. Additionally, MSS is supported in part by the Georgia Institute of Technology School of Mathematics, ZS acknowledges support from the Danish Agency for Science, Technology, and Innovation under grant #09061856 and a travel grant from the Aarhus University Research Foundation, DAB is supported in part by NSF Grants CNS0614915 and OCI0904461, and research of CEH is supported in part by a Career Award at the Scientific Interface (CASI) from the Burroughs Wellcome Fund (BWF).
References

Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P: Fast folding and comparison of RNA secondary structures.
Monatsh Chem 1994, 125(2):167188. Publisher Full Text

Hofacker IL, Huynen MA, Stadler PF, Stolorz PE: Knowledge Discovery in RNA, Sequence Families of HIV using scalable computers. In Proc. of the 2nd Int’l Conf. on Knowledge Discovery and Data Mining. Portland, OR; 1996. PubMed Abstract

Fekete M, Hofacker IL, Stadler PF: Prediction of RNA Base Pairing Probabilities on Massively Parallel Computers.
J Computational Biology 2000, 7(12):171182. Publisher Full Text

Chen JH, Le SY, Shapiro BA, Maizel JV: Optimization of an RNA folding algorithm for parallel architectures.
Parallel Computing 1998, 24:16171634. Publisher Full Text

Markham NR, Zuker M: UNAFold: Software for Nucleic Acid Folding and Hybridization. In Bioinformatics: Structure, Function, and Applications, Volume 453 of Methods in Molecular Biology. Edited by Keith JM. Totowa, NJ: Humana Press; 2008:331.

Hofacker IL, Stadler PF: Memory Efficient Folding Algorithms for Circular RNA Secondary Structures.
Bioinformatics 2006, 22(10):11721176. PubMed Abstract  Publisher Full Text

Zuker M, Stiegler P: Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information.
Nucleic Acids Res 1981, 9:133148. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Mathews DH, Sabina J, Zuker M, Turner DH: Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure.
J Mol Biol 1999, 288(5):911940. PubMed Abstract  Publisher Full Text

Mathews DH, Disney MD, Childs JL, Schroeder SJ, Zuker M, Turner DH: Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure.
Proc Natl Acad Sci USA 2004, 101(19):72877292. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Lyngsø RB, Zuker M, Pedersen CNS: Internal loops in RNA, secondary structure prediction. In Proc. of the 3rd Ann. Int’l Conf. on Computational Molecular Biology (RECOMB). Lyon, France; 1999:260267.

OpenMP Architecture Review Board: OpenMP Application Program Interface Version 3.1.
2011.

Bader DA, Moret BME, Sanders P: Algorithm Engineering for Parallel Computation. In Experimental Algorithmics, LNCS. Edited by Fleischer R. Berlin Heidelberg: SpringerVerlag; 2002:123.

Mathuriya A, Bader DA, Heitsch CE, Harvey SC: GTfold: A Scalable Multicore Code for RNA, Secondary Structure Prediction. In Proceedings of the 24th Annual ACM Symposium on Applied Computing (SAC), Computational Sciences Track. Honolulu, HI; 2009.

Gutell RR, Lee JC, Cannone JJ: The accuracy of ribosomal RNA,comparative structure models.
Curr Opin Struct Biol 2002, 12(3):301310. PubMed Abstract  Publisher Full Text

Cannone JJ, Subramanian S, Schnare MN, Collett JR, D’Souza LM, Du Y, Feng B, Lin N, Madabusi LV, Müller KM, Pande N, Shang Z, Yu N, Gutell RR: The Comparative RNA, Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs.

Gardner PP, Giegerich R: A comprehensive comparison of comparative RNA, structure prediction approaches.

Zuker M: RNA folding prediction: the continued need for interaction between biologists and mathematicians. In Some mathematical questions in biology—DNA sequence analysis (New York, 1984), Volume 17 of Lectures Math. Life Sci. Providence RI: Amer. Math. Soc; 1986:87124. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Turner DH, Mathews DH: NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure.

Zuker M: FAQs for mfold computations.
2006.

Wuchty S, Fontana W, Hofacker IL, Schuster P: Complete suboptimal folding of RNA and the stability of secondary structures.
Biopolymers 1999, 49(2):14565. PubMed Abstract  Publisher Full Text

McCaskill JS: The equilibrium partition function and base pair binding probabilities for RNA,secondary structure.
Biopolymers 1990, 29(67):110519. PubMed Abstract  Publisher Full Text

Deigan KE, Lia TW, Mathews DH, Weeks KM: Accurate SHAPEdirected RNA structure determination.
Proc Natl Acad Sci USA 2009, 106:97102. PubMed Abstract  Publisher Full Text  PubMed Central Full Text