Open Access Highly Accessed Methodology article

Integrated Weighted Gene Co-expression Network Analysis with an Application to Chronic Fatigue Syndrome

Angela P Presson1,2, Eric M Sobel3, Jeanette C Papp3, Charlyn J Suarez3, Toni Whistler4, Mangalathu S Rajeevan4, Suzanne D Vernon4,5 and Steve Horvath1,3*

Author Affiliations

1 Biostatistics, University of California, Los Angeles, CA, USA

2 Pediatrics, University of California, Los Angeles, CA, USA

3 Human Genetics, University of California, Los Angeles, CA, USA

4 Division of Viral and Rickettsial Diseases, National Center for Zoonotic, Vector-Borne and Enteric Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA

5 Chronic Fatigue and Immune Dysfunction Syndrome (CFIDS), PO Box 220398, Charlotte, NC, USA

BMC Systems Biology 2008, 2:95  doi:10.1186/1752-0509-2-95

Published: 6 November 2008

Abstract

Background

Systems biology approaches such as Weighted Gene Co-expression Network Analysis (WGCNA) can effectively integrate gene expression and trait data to identify pathways and candidate biomarkers. Here we show that additionally including genetic marker data allows one to characterize network relationships as causal or reactive in a chronic fatigue syndrome (CFS) data set.

Results

We combine WGCNA with genetic marker data to identify a disease-related pathway and its causal drivers, an analysis which we refer to as "Integrated WGCNA" or IWGCNA. Specifically, we present the following IWGCNA approach: 1) construct a co-expression network, 2) identify trait-related modules within the network, 3) use a trait-related genetic marker to prioritize genes within the module, 4) apply an integrated gene screening strategy to identify candidate genes and 5) carry out causality testing to verify and/or prioritize results. By applying this strategy to a CFS data set consisting of microarray, SNP and clinical trait data, we identify a module of 299 highly correlated genes that is associated with CFS severity. Our integrated gene screening strategy results in 20 candidate genes. We show that our approach yields biologically interesting genes that function in the same pathway and are causal drivers for their parent module. We use a separate data set to replicate findings and use Ingenuity Pathways Analysis software to functionally annotate the candidate gene pathways.
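As a rough illustration of steps 1 to 4, the following Python sketch builds a soft-thresholded co-expression adjacency and screens the genes of one module by their correlation with a trait and with a genetic marker. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the soft-threshold power `beta=6`, the cutoffs `min_trait` and `min_marker`, and the ranking by intramodular connectivity are all hypothetical choices made for this example. Module detection (step 2) and causality testing (step 5) are not shown; the module membership is assumed to be given.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Weighted adjacency |cor(x_i, x_j)|**beta for a samples x genes matrix.

    beta=6 is an illustrative default, not a value taken from the article.
    """
    corr = np.corrcoef(expr, rowvar=False)   # genes x genes Pearson correlations
    adj = np.abs(corr) ** beta               # soft thresholding keeps weights in [0, 1]
    np.fill_diagonal(adj, 0.0)               # ignore self-connections
    return adj

def abs_correlation(expr, vector):
    """Absolute Pearson correlation of every gene (column) with one vector."""
    expr_c = expr - expr.mean(axis=0)
    vec_c = vector - vector.mean()
    num = expr_c.T @ vec_c
    den = np.sqrt((expr_c ** 2).sum(axis=0) * (vec_c ** 2).sum())
    return np.abs(num / den)

def screen_module_genes(expr, trait, marker, module_idx,
                        min_trait=0.2, min_marker=0.2):
    """Hypothetical screen: keep module genes correlated with both the trait
    and the marker, ranked by intramodular connectivity (highest first)."""
    module_idx = np.asarray(module_idx)
    adj = soft_threshold_adjacency(expr)
    # Intramodular connectivity: sum of adjacencies to other genes in the module.
    k_in = adj[np.ix_(module_idx, module_idx)].sum(axis=1)
    sub = expr[:, module_idx]
    keep = (abs_correlation(sub, trait) >= min_trait) & \
           (abs_correlation(sub, marker) >= min_marker)
    order = np.argsort(-k_in[keep])
    return module_idx[keep][order]
```

Here `expr` is a samples-by-genes array, `trait` is a per-sample severity score, and `marker` is a per-sample genotype coded numerically (for example 0, 1, 2). The returned array lists the indices of candidate genes within the given module.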

Conclusion

We show how WGCNA can be combined with genetic marker data to identify disease-related pathways and the causal drivers within them. The systems genetics approach described here can easily be used to generate testable genetic hypotheses in other complex disease studies.