This article is part of the supplement: Beyond the Genome 2012

Open Access Poster presentation

CLIA-certified next-generation sequencing analysis in the cloud

Ying Zhang1*, Jesse Erdmann1, John Chilton1, Getiria Onsongo1, Matthew Bower2,3, Kenny Beckman4, Bharat Thyagarajan5, Kevin Silverstein1, Anne-Francoise Lamblin1 and the Whole Galaxy Team at MSI1

  • * Corresponding author: Ying Zhang

Author affiliations

1 Research Informatics Support System, Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA

2 Division of Genetics and Metabolism, University of Minnesota, Minneapolis, MN 55455, USA

3 Molecular Diagnostics Laboratory, University of Minnesota Medical Center-Fairview, University of Minnesota, Minneapolis, MN 55455, USA

4 BioMedical Genomics Center, University of Minnesota, Minneapolis, MN 55455, USA

5 Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA

Citation and License

BMC Proceedings 2012, 6(Suppl 6):P54  doi:10.1186/1753-6561-6-S6-P54

The electronic version of this article is the complete one and can be found online at:

Published: 1 October 2012

© 2012 Zhang et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Poster presentation

The development of next-generation sequencing (NGS) technology opens new avenues for clinical researchers to make discoveries, especially in clinical diagnostics. However, combining NGS and clinical data presents two challenges: first, giving clinicians access to the computing power needed to analyze high volumes of NGS data; and second, meeting the stringent requirements for accuracy and for governance of patient information in a clinical setting.

Cloud computing is a natural fit for the computing power requirements, while Clinical Laboratory Improvement Amendments (CLIA) certification provides a baseline standard for researchers working with clinical data. Combining a cloud-computing environment with CLIA certification presents its own challenges, because the level of control users have over a cloud environment is at odds with CLIA's stability requirements. We have bridged this gap by creating a locked virtual machine with a pre-defined and validated set of workflows. This virtual machine is created using our Galaxy VM launcher tool, which instantiates a Galaxy environment at Amazon with specific versions of the tools used in the workflow. The VM launcher tool can reliably recreate the same virtual machine in several cloud environments. Once a baseline virtual machine is created, the tool can launch any number of clones to analyze samples in parallel.

We describe herein a pilot project as an example of a working clinical analysis pipeline. To validate NGS-based clinical diagnosis of diseases with a genetic cause, patient samples were collected by Dr Bharat Thyagarajan and staff at the Molecular Diagnostics Laboratory, University of Minnesota Medical Center-Fairview. The patient samples were analyzed using customized hybrid-capture bait libraries to boost read coverage in low-coverage regions, followed by targeted enrichment sequencing at the BioMedical Genomics Center. The NGS data are imported into a tested Galaxy single nucleotide polymorphism (SNP) detection workflow in a locked Galaxy virtual machine on Amazon's Elastic Compute Cloud (EC2). This project illustrates our ability to carry out CLIA-certified NGS analysis in the cloud, and will provide valuable guidance for any future implementation of NGS analysis involving clinical diagnosis.
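The idea of a locked environment with pinned tool versions and identical clones can be illustrated with a minimal Python sketch. It is not the Galaxy VM launcher itself and does not call any Galaxy or Amazon API. The tool names, version strings, and the functions `check_locked_environment` and `assign_samples` are hypothetical, chosen only for illustration. The sketch assumes a validated manifest of tool versions is compared against what a clone actually reports, and that samples are then spread round-robin across clones.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VersionMismatch:
    """One deviation of a clone from the validated manifest (hypothetical type)."""
    tool: str
    expected: str
    found: str  # "missing" when the tool is absent from the clone


def check_locked_environment(
    pinned: dict[str, str], installed: dict[str, str]
) -> list[VersionMismatch]:
    """Return every difference between installed tools and the pinned manifest.

    An empty list means the clone matches the validated configuration.
    """
    mismatches = []
    for tool, expected in sorted(pinned.items()):
        found = installed.get(tool, "missing")
        if found != expected:
            mismatches.append(VersionMismatch(tool, expected, found))
    # Tools that are installed but not in the manifest also break the lock.
    for tool in sorted(set(installed) - set(pinned)):
        mismatches.append(VersionMismatch(tool, "not allowed", installed[tool]))
    return mismatches


def assign_samples(samples: list[str], n_clones: int) -> dict[int, list[str]]:
    """Distribute samples round-robin across n_clones identical clones."""
    if n_clones < 1:
        raise ValueError("at least one clone is required")
    batches: dict[int, list[str]] = {i: [] for i in range(n_clones)}
    for index, sample in enumerate(samples):
        batches[index % n_clones].append(sample)
    return batches


if __name__ == "__main__":
    # Hypothetical manifest; real tool names and versions are not given in the abstract.
    manifest = {"aligner": "1.0", "variant_caller": "2.3"}
    clone_report = {"aligner": "1.0", "variant_caller": "2.3"}
    if check_locked_environment(manifest, clone_report):
        raise SystemExit("clone deviates from the validated manifest")
    print(assign_samples(["sample_a", "sample_b", "sample_c"], n_clones=2))
```

In this sketch a clone is rejected on any mismatch, including a tool that is installed but absent from the manifest, on the assumption that a validated environment should contain nothing that was not validated.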