R/parallel – speeding up bioinformatics analysis with R

Gonzalo Vera1,2, Ritsert C Jansen1 and Remo L Suppi2

1Groningen Bioinformatics Centre (GBiC), Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Haren, The Netherlands

2Computer Architecture and Operating Systems Department (CAOS), Universitat Autònoma de Barcelona, Bellaterra, Spain

BMC Bioinformatics 2008, 9:390 doi:10.1186/1471-2105-9-390

Published: 22 September 2008

Abstract

Background

R is the preferred tool for statistical analysis among many bioinformaticians, due in part to the increasing number of freely available analytical methods. Such methods can be quickly reused and adapted to each particular experiment. However, in experiments where large amounts of data are generated, for example using high-throughput screening devices, the processing time required to analyze the data is often quite long. One way to reduce the processing time is to use parallel computing technologies. Because R does not support parallel computations, several tools have been developed to enable such technologies. However, these tools require multiple modifications to the way R programs are usually written or run. Although these tools can ultimately speed up the calculations, the time, skills and additional resources required to use them are an obstacle for most bioinformaticians.

Results

We have designed and implemented an R add-on package, R/parallel, that extends R by adding user-friendly parallel computing capabilities. With R/parallel, any bioinformatician can now easily automate the parallel execution of loops and benefit from the multicore processor power of today's desktop computers. Because it is used through a single, simple function, R/parallel can be integrated directly with other existing R packages. With no need to change the implemented algorithms, the processing time can be reduced approximately N-fold, N being the number of available processor cores.
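The idea of running a loop's independent iterations on several cores can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce the R/parallel API. The names `analyze`, `split_chunks`, `run_chunk` and `parallel_loop` are hypothetical and chosen only for illustration. The sketch assumes that the iterations do not depend on one another. The N-fold reduction is an ideal upper bound, since starting worker processes and moving data between them adds overhead.

```python
import math
import os
from concurrent.futures import ProcessPoolExecutor


def analyze(x):
    """Hypothetical stand-in for the body of one loop iteration."""
    return sum(math.sqrt(i) for i in range(x))


def split_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal slices."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for k in range(n_chunks):
        # The first `extra` chunks receive one additional item.
        end = start + size + (1 if k < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_chunk(func, chunk):
    """Run the original serial loop over one slice of the input."""
    return [func(x) for x in chunk]


def parallel_loop(func, items, workers=None):
    """Apply func to every item, one chunk per worker process.

    Falls back to the plain serial loop when only one worker is used.
    func must be defined at module level so it can be sent to workers.
    """
    workers = workers or os.cpu_count() or 1
    chunks = split_chunks(items, workers)
    if workers == 1 or len(chunks) == 1:
        return run_chunk(func, items)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so the output order
        # matches the order of the serial loop.
        parts = pool.map(run_chunk, [func] * len(chunks), chunks)
        return [y for part in parts for y in part]


if __name__ == "__main__":
    data = list(range(1000, 1016))
    serial = [analyze(x) for x in data]
    # The parallel version must give the same results as the serial loop.
    assert parallel_loop(analyze, data) == serial
```

The loop body (`analyze`) is unchanged in this sketch. Only the code that drives the loop differs, which mirrors the abstract's point that the implemented algorithms need not be modified.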

Conclusion

R/parallel saves bioinformaticians time in their daily tasks of analyzing experimental data. It achieves this objective on two fronts: first, it reduces the development time of parallel programs by avoiding the reimplementation of existing methods; second, it reduces processing time by speeding up computations on current desktop computers. Future work will focus on extending the reach of R/parallel by interconnecting and aggregating the power of several computers, both existing office computers and computing clusters.
