Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Highly Accessed Software

The Ruby UCSC API: accessing the UCSC genome database using Ruby

Hiroyuki Mishima1*, Jan Aerts23, Toshiaki Katayama4, Raoul J P Bonnal5 and Koh-ichiro Yoshiura1

Author Affiliations

1 Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, 1-12-4 Sakamoto, Nagasaki, Nagasaki, 852-8523, Japan

2 Faculty of Engineering – ESAT/SCD, Leuven University, Kasteelpark Arenberg 10 – bus 2446, 3001, Leuven, Belgium

3 IBBT-KULeuven Future Health Department, Leuven, Belgium

4 Database Center for Life Science, Research Organization of Information and Systems, Faculty of Engineering Bldg. 12, The University of Tokyo, 2-11-16, Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan

5 Integrative Biology Program, Fondazione Istituto Nazionale di Genetica Molecolare, via Francesco Sforza, 28-20122, Milan, Italy

For all author emails, please log on.

BMC Bioinformatics 2012, 13:240  doi:10.1186/1471-2105-13-240

Published: 21 September 2012

Abstract

Background

The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby.

Results

The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast.

The API uses the bin index—if available—when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby).

Conclusions

Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ webcite under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/ webcite.