Investigating perturbed pathway modules from gene expression data via structural equation models
- Equal contributors
Department of Brain and Behavioural Sciences, Medical and Genomic Statistics Unit, University of Pavia, Pavia, Italy
BMC Bioinformatics 2014, 15:132 doi:10.1186/1471-2105-15-132Published: 6 May 2014
It is currently accepted that the perturbation of complex intracellular networks, rather than the dysregulation of a single gene, is the basis for phenotypical diversity. High-throughput gene expression data allow to investigate changes in gene expression profiles among different conditions. Recently, many efforts have been made to individuate which biological pathways are perturbed, given a list of differentially expressed genes (DEGs). In order to understand these mechanisms, it is necessary to unveil the variation of genes in relation to each other, considering the different phenotypes. In this paper, we illustrate a pipeline, based on Structural Equation Modeling (SEM) that allowed to investigate pathway modules, considering not only deregulated genes but also the connections between the perturbed ones.
The procedure was tested on microarray experiments relative to two neurological diseases: frontotemporal lobar degeneration with ubiquitinated inclusions (FTLD-U) and multiple sclerosis (MS). Starting from DEGs and dysregulated biological pathways, a model for each pathway was generated using databases information biological databases, in order to design how DEGs were connected in a causal structure. Successively, SEM analysis proved if pathways differ globally, between groups, and for specific path relationships. The results confirmed the importance of certain genes in the analyzed diseases, and unveiled which connections are modified among them.
We propose a framework to perform differential gene expression analysis on microarray data based on SEM, which is able to: 1) find relevant genes and perturbed biological pathways, investigating putative sub-pathway models based on the concept of disease module; 2) test and improve the generated models; 3) detect a differential expression level of one gene, and differential connection between two genes. This could shed light, not only on the mechanisms affecting variations in gene expression, but also on the causes of gene-gene relationship modifications in diseased phenotypes.