Open Access Technical Note

SigWin-detector: a Grid-enabled workflow for discovering enriched windows of genomic features related to DNA sequences

Márcia A Inda1, Marinus F van Batenburg2, Marco Roos3, Adam SZ Belloum3, Dmitry Vasunin3, Adianto Wibisono3, Antoine HC van Kampen2 and Timo M Breit1

1Integrative Bioinformatics Unit, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, PO Box 94062, 1090 GB Amsterdam, The Netherlands

2Bioinformatics Laboratory, Academic Medical Center, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands

3Institute of Informatics, Faculty of Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands


BMC Research Notes 2008, 1:63 doi:10.1186/1756-0500-1-63

Published: 8 August 2008

Abstract

Background

Chromosome location is often used as a scaffold to organize genomic information in both the living cell and molecular biological research. Thus, ever-increasing amounts of data about genomic features are stored in public databases and can be readily visualized by genome browsers. To perform in silico experimentation conveniently with these genomic data, biologists need tools to process and compare datasets routinely and explore the obtained results interactively. The complexity of such experimentation requires these tools to be based on an e-Science approach and hence to be generic, modular, and reusable. A virtual laboratory environment with workflows, workflow management systems, and Grid computation is therefore essential.

Findings

Here we apply an e-Science approach to develop SigWin-detector, a workflow-based tool that can detect significantly enriched windows of (genomic) features in a (DNA) sequence in a fast and reproducible way. For proof-of-principle, we utilize a biological use case to detect regions of increased and decreased gene expression (RIDGEs and anti-RIDGEs) in human transcriptome maps. We improved the original method for RIDGE detection by replacing the costly step of estimation by random sampling with a faster analytical formula for computing the distribution of the null hypothesis being tested and by developing a new algorithm for computing moving medians. SigWin-detector was developed using the WS-VLAM workflow management system and consists of several reusable modules that are linked together in a basic workflow. The configuration of this basic workflow can be adapted to satisfy the requirements of the specific in silico experiment.
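To make the idea of a moving median over sliding windows concrete, the following is a minimal Python sketch. It is not the new moving-median algorithm described in the article, nor the analytical null-distribution formula; it is a simple illustration that keeps each window in a sorted list. The function names and the `threshold` parameter are hypothetical, and a fixed cutoff stands in for the significance test.

```python
from bisect import bisect_left, insort


def moving_median(values, window):
    """Return the median of every length-`window` slice of `values`."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    current = sorted(values[:window])  # sorted copy of the first window
    mid = window // 2
    medians = []
    for start in range(len(values) - window + 1):
        if start > 0:
            # Slide by one position: drop the value that left the window,
            # then insert the value that entered it, keeping the list sorted.
            del current[bisect_left(current, values[start - 1])]
            insort(current, values[start + window - 1])
        if window % 2:
            medians.append(current[mid])
        else:
            medians.append((current[mid - 1] + current[mid]) / 2)
    return medians


def enriched_windows(values, window, threshold):
    """Start indices of windows whose median exceeds `threshold`.

    `threshold` is a hypothetical fixed cutoff used only for illustration;
    it is not derived from a null distribution.
    """
    medians = moving_median(values, window)
    return [i for i, m in enumerate(medians) if m > threshold]


if __name__ == "__main__":
    expression = [1, 5, 2, 8, 7, 3]  # made-up values along a sequence
    print(moving_median(expression, 3))
    print(enriched_windows(expression, 3, 6))
```

In this sketch each slide costs time linear in the window size because of the list deletion and insertion; it is meant to show the sliding-window structure, not to be efficient on genome-scale data.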

Conclusion

As we show with the results from analyses in the biological use case on RIDGEs, SigWin-detector is an efficient and reusable Grid-based tool for discovering windows enriched for features of a particular type in any sequence of values. Thus, SigWin-detector provides the proof-of-principle for the modular, e-Science-based concept of integrative bioinformatics experimentation.

