Email updates

Keep up to date with the latest news and content from BMC Genomics and BioMed Central.

Open Access Software

Confero: an integrated contrast data and gene set platform for computational analysis and biological interpretation of omics data

Leandro Hermida1*, Carine Poussin1, Michael B Stadler234, Sylvain Gubian1, Alain Sewer1, Dimos Gaidatzis234, Hans-Rudolf Hotz234, Florian Martin1, Vincenzo Belcastro1, Stéphane Cano1, Manuel C Peitsch1 and Julia Hoeng1*

Author Affiliations

1 Philip Morris International Research & Development, Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland

2 Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH-4058 Basel, Switzerland

3 University of Basel, Petersplatz 10, CH-4003 Basel, Switzerland

4 Swiss Institute of Bioinformatics, Maulbeerstrasse 66, CH-4058 Basel, Switzerland

For all author emails, please log on.

BMC Genomics 2013, 14:514  doi:10.1186/1471-2164-14-514

Published: 29 July 2013

Abstract

Background

High-throughput omics technologies such as microarrays and next-generation sequencing (NGS) have become indispensable tools in biological research. Computational analysis and biological interpretation of omics data can pose significant challenges due to a number of factors, in particular the systems integration required to fully exploit and compare data from different studies and/or technology platforms. In transcriptomics, the identification of differentially expressed genes when studying effect(s) or contrast(s) of interest constitutes the starting point for further downstream computational analysis (e.g. gene over-representation/enrichment analysis, reverse engineering) leading to mechanistic insights. Therefore, it is important to systematically store the full list of genes with their associated statistical analysis results (differential expression, t-statistics, p-value) corresponding to one or more effect(s) or contrast(s) of interest (shortly termed as ” contrast data”) in a comparable manner and extract gene sets in order to efficiently support downstream analyses and further leverage data on a long-term basis. Filling this gap would open new research perspectives for biologists to discover disease-related biomarkers and to support the understanding of molecular mechanisms underlying specific biological perturbation effects (e.g. disease, genetic, environmental, etc.).

Results

To address these challenges, we developed Confero, a contrast data and gene set platform for downstream analysis and biological interpretation of omics data. The Confero software platform provides storage of contrast data in a simple and standard format, data transformation to enable cross-study and platform data comparison, and automatic extraction and storage of gene sets to build new a priori knowledge which is leveraged by integrated and extensible downstream computational analysis tools. Gene Set Enrichment Analysis (GSEA) and Over-Representation Analysis (ORA) are currently integrated as an analysis module as well as additional tools to support biological interpretation. Confero is a standalone system that also integrates with Galaxy, an open-source workflow management and data integration system. To illustrate Confero platform functionality we walk through major aspects of the Confero workflow and results using the Bioconductor estrogen package dataset.

Conclusion

Confero provides a unique and flexible platform to support downstream computational analysis facilitating biological interpretation. The system has been designed in order to provide the researcher with a simple, innovative, and extensible solution to store and exploit analyzed data in a sustainable and reproducible manner thereby accelerating knowledge-driven research. Confero source code is freely available from http://sourceforge.net/projects/confero/ webcite.

Keywords:
Gene expression; Contrast data; Gene set; Gene set enrichment; Omics; Microarray; Next-generation sequencing; Reproducible research system; Knowledge acquisition