Open Access Highly Accessed Software

Applications of the pipeline environment for visual informatics and genomics computations

Ivo D Dinov (1,2), Federica Torri (2,3), Fabio Macciardi (2,3), Petros Petrosyan (1), Zhizhong Liu (1), Alen Zamanyan (1), Paul Eggert (1,4), Jonathan Pierce (1), Alex Genco (1), James A Knowles (5), Andrew P Clark (5), John D Van Horn (1), Joseph Ames (2), Carl Kesselman (2) and Arthur W Toga (1,2)*

Author Affiliations

1 Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, Los Angeles, 90095, USA

2 Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, 90292, USA

3 Departments of Psychiatry and Human Behavior, University of California, Irvine, Irvine, 92617, USA

4 Department of Computer Science, University of California, Los Angeles, Los Angeles, 90095, USA

5 Zilkha Neurogenetic Institute, USC Keck School of Medicine, Los Angeles, 90033, USA

BMC Bioinformatics 2011, 12:304  doi:10.1186/1471-2105-12-304

Published: 26 July 2011

Abstract

Background

Contemporary informatics and genomics research requires efficient, flexible and robust management of large heterogeneous data, advanced computational tools, powerful visualization, reliable hardware infrastructure, interoperability of computational resources, and detailed data and analysis-protocol provenance. The Pipeline is a client-server distributed computational environment that facilitates the graphical construction, execution, monitoring, validation and dissemination of advanced data analysis protocols.

Results

This paper reports on applications of the LONI Pipeline environment to two informatics challenges: graphical management of diverse genomics tools, and the interoperability of informatics software. Specifically, this manuscript presents the concrete details of deploying general informatics suites and individual software tools to new hardware infrastructures, the design, validation and execution of new visual analysis protocols via the Pipeline graphical interface, and the integration of diverse informatics tools via the Pipeline eXtensible Markup Language syntax. We demonstrate each of these processes using several established informatics packages (e.g., miBLAST, EMBOSS, mrFAST, GWASS, MAQ, SAMtools, Bowtie) for basic local sequence alignment and search, molecular biology data analysis, and genome-wide association studies. These examples demonstrate the power of the Pipeline graphical workflow environment to enable integration of bioinformatics resources that provide a well-defined syntax for dynamic specification of the input/output parameters and the run-time execution controls.

Conclusions

The LONI Pipeline environment (http://pipeline.loni.ucla.edu) provides a flexible graphical infrastructure for efficient biomedical computing and distributed informatics research. The interactive Pipeline resource manager enables the utilization and interoperability of diverse types of informatics resources. The Pipeline client-server model provides computational power to a broad spectrum of informatics investigators: experienced developers and novice users, users with or without access to advanced computational resources (e.g., Grid, data), as well as basic and translational scientists. The open development, validation and dissemination of computational networks (pipeline workflows) facilitates the sharing of knowledge, tools, protocols and best practices, and enables the unbiased validation and replication of scientific findings by the entire community.