Network-based group variable selection for detecting expression quantitative trait loci (eQTL)
1 Mathematics and Physics, School of Sciences, Tsinghua University, Beijing 100084, China
2 MOE Key Laboratory of Bioinformatics/Bioinformatics Division, TNLIST, Beijing 10084, China
3 Department of Automation, Tsinghua University, Beijing 100084, China
BMC Bioinformatics 2011, 12:269 doi:10.1186/1471-2105-12-269Published: 30 June 2011
Analysis of expression quantitative trait loci (eQTL) aims to identify the genetic loci associated with the expression level of genes. Penalized regression with a proper penalty is suitable for the high-dimensional biological data. Its performance should be enhanced when we incorporate biological knowledge of gene expression network and linkage disequilibrium (LD) structure between loci in high-noise background.
We propose a network-based group variable selection (NGVS) method for QTL detection. Our method simultaneously maps highly correlated expression traits sharing the same biological function to marker sets formed by LD. By grouping markers, complex joint activity of multiple SNPs can be considered and the dimensionality of eQTL problem is reduced dramatically. In order to demonstrate the power and flexibility of our method, we used it to analyze two simulations and a mouse obesity and diabetes dataset. We considered the gene co-expression network, grouped markers into marker sets and treated the additive and dominant effect of each locus as a group: as a consequence, we were able to replicate results previously obtained on the mouse linkage dataset. Furthermore, we observed several possible sex-dependent loci and interactions of multiple SNPs.
The proposed NGVS method is appropriate for problems with high-dimensional data and high-noise background. On eQTL problem it outperforms the classical Lasso method, which does not consider biological knowledge. Introduction of proper gene expression and loci correlation information makes detecting causal markers more accurate. With reasonable model settings, NGVS can lead to novel biological findings.