Open Access Highly Accessed Methodology article

Investigating perturbed pathway modules from gene expression data via structural equation models

Daniele Pepe* and Mario Grassi

Author Affiliations

Department of Brain and Behavioural Sciences, Medical and Genomic Statistics Unit, University of Pavia, Pavia, Italy

BMC Bioinformatics 2014, 15:132  doi:10.1186/1471-2105-15-132

Published: 6 May 2014

Abstract

Background

It is currently accepted that the perturbation of complex intracellular networks, rather than the dysregulation of a single gene, is the basis for phenotypical diversity. High-throughput gene expression data make it possible to investigate changes in gene expression profiles among different conditions. Recently, many efforts have been made to identify which biological pathways are perturbed, given a list of differentially expressed genes (DEGs). In order to understand these mechanisms, it is necessary to unveil how genes vary in relation to each other across the different phenotypes. In this paper, we illustrate a pipeline based on Structural Equation Modeling (SEM) that makes it possible to investigate pathway modules, considering not only deregulated genes but also the connections between the perturbed ones.

Results

The procedure was tested on microarray experiments relating to two neurological diseases: frontotemporal lobar degeneration with ubiquitinated inclusions (FTLD-U) and multiple sclerosis (MS). Starting from DEGs and dysregulated biological pathways, a model for each pathway was generated using information from biological databases, in order to specify how DEGs were connected in a causal structure. Subsequently, SEM analysis tested whether pathways differ between groups, both globally and for specific path relationships. The results confirmed the importance of certain genes in the analyzed diseases, and unveiled which connections among them are modified.

Conclusions

We propose a framework to perform differential gene expression analysis on microarray data based on SEM, which is able to: 1) find relevant genes and perturbed biological pathways, investigating putative sub-pathway models based on the concept of disease module; 2) test and improve the generated models; 3) detect differential expression of a single gene and a differential connection between two genes. This could shed light not only on the mechanisms affecting variations in gene expression, but also on the causes of gene-gene relationship modifications in diseased phenotypes.
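To make the notion of a "differential connection between two genes" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single directed edge A -> B in a recursive (acyclic) model, estimates the path coefficient separately in each group by ordinary least squares, and compares the two estimates with a z-statistic. The gene names, sample size, simulated coefficients, and the helper functions `fit_edge` and `edge_difference_z` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_edge(x, y):
    """Return the OLS slope of y on x (with intercept) and its standard error."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof              # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the coefficients
    return beta[1], np.sqrt(cov[1, 1])

def edge_difference_z(x1, y1, x2, y2):
    """z-statistic for the difference in one path coefficient between two groups."""
    b1, se1 = fit_edge(x1, y1)
    b2, se2 = fit_edge(x2, y2)
    return (b1 - b2) / np.sqrt(se1**2 + se2**2)

# Simulated expression values (assumption): the A -> B connection is strong
# in controls and almost absent in cases.
rng = np.random.default_rng(0)
n = 200
a_ctrl = rng.normal(size=n)
b_ctrl = 0.8 * a_ctrl + rng.normal(scale=0.5, size=n)
a_case = rng.normal(size=n)
b_case = 0.1 * a_case + rng.normal(scale=0.5, size=n)

z = edge_difference_z(a_ctrl, b_ctrl, a_case, b_case)
print(f"z-statistic for the A -> B edge difference: {z:.2f}")
```

A large absolute z-value suggests that the edge differs between groups. A full multigroup SEM would instead fit all equations of the pathway model jointly and also assess global model fit, which this single-edge sketch does not attempt.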

Keywords:
Structural equation modeling (SEM); Model generation; Pathway analysis; Perturbed models; Perturbed edges