Open Access Methodology article

Mouse obesity network reconstruction with a variational Bayes algorithm to employ aggressive false positive control

Benjamin A Logsdon1,2, Gabriel E Hoffman2 and Jason G Mezey2,3*

Author Affiliations

1 Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA

2 Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA

3 Department of Genetic Medicine, Weill Cornell Medical College, New York, New York, USA

BMC Bioinformatics 2012, 13:53  doi:10.1186/1471-2105-13-53

Published: 2 April 2012

Abstract

Background

We propose a novel variational Bayes network reconstruction algorithm to extract the most relevant disease factors from high-throughput genomic data-sets. Our algorithm is the only scalable method for regularized network recovery that employs Bayesian model averaging and that can internally estimate an appropriate level of sparsity, ensuring that few false positives enter the model without the need for cross-validation or a model selection criterion. We use our algorithm to characterize the effect of genetic markers and liver gene expression traits on mouse obesity-related phenotypes, including weight, cholesterol, glucose, and free fatty acid levels, in an experiment previously used for discovery and validation of network connections: an F2 intercross between the C57BL/6J and C3H/HeJ mouse strains, where apolipoprotein E is null on the background.

Results

We identified eleven genes (Gch1, Zfp69, Dlgap1, Gna14, Yy1, Gabarapl1, Folr2, Fdft1, Cnr2, Slc24a3, and Ccl19) and a quantitative trait locus directly connected to weight, glucose, cholesterol, or free fatty acid levels in our network. None of these genes were identified by other network analyses of this mouse intercross data-set, but all have been previously associated with obesity or related pathologies in independent studies. In addition, through both simulations and data analysis, we demonstrate that our algorithm achieves better power and type I error control than other network recovery algorithms that use the lasso and have bounds on type I error control.

Conclusions

Our final network contains 118 previously associated and novel genes affecting weight, cholesterol, glucose, and free fatty acid levels that are excellent obesity risk candidates.