Research article (Open Access)

Detecting disease-associated genes with confounding variable adjustment and the impact on genomic meta-analysis: With application to major depressive disorder

Xingbin Wang1, Yan Lin1, Chi Song1, Etienne Sibille2* and George C Tseng1,3,4*

Author Affiliations

1 Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA

2 Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15260, USA

3 Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA 15261, USA

4 Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260, USA

BMC Bioinformatics 2012, 13:52  doi:10.1186/1471-2105-13-52

Published: 29 March 2012

Abstract

Background

Detecting candidate markers in transcriptomic studies of complex diseases is often difficult, particularly when overall signals are weak and sample sizes are small. Covariates, including demographic, clinical and technical variables, are often confounded with the underlying disease effects, which further hampers accurate biomarker detection. Our motivating example comes from an analysis of five microarray studies of major depressive disorder (MDD), a heterogeneous psychiatric illness with mostly uncharacterized genetic mechanisms.

Results

We applied a random intercept model to account for confounding variables and the paired case-control design. A variable selection scheme was developed to determine the effective confounders for each gene. Meta-analysis methods were used to integrate information from the five studies, and post hoc analyses enhanced biological interpretation. Simulations and application results showed that adjustment for confounding variables and meta-analysis improved the detection of biomarkers and associated pathways.
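As a rough illustration of the meta-analysis step described above, the sketch below combines per-study p-values for a single gene using Fisher's method. The abstract does not say which meta-analysis methods were used, so Fisher's method is an assumption chosen for simplicity. The function name `fisher_combined_pvalue` and the five p-values are hypothetical and do not come from the paper.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values for one gene with Fisher's method.

    Returns (statistic, combined_pvalue). Assumes the studies are
    independent and each p-value lies in (0, 1].
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(pvalues)
    # Under the null, -2 * sum(log p) follows a chi-square
    # distribution with 2k degrees of freedom.
    statistic = -2.0 * sum(math.log(p) for p in pvalues)

    # Closed-form chi-square survival function for even df = 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values for one gene across five studies
# (illustrative numbers only, not results from the paper).
study_pvalues = [0.04, 0.20, 0.03, 0.15, 0.08]
stat, combined = fisher_combined_pvalue(study_pvalues)
print(f"Fisher statistic = {stat:.3f}, combined p-value = {combined:.5f}")
```

In a pipeline of the kind outlined here, the per-study p-values would presumably come from the confounder-adjusted random intercept model fitted to each gene in each study. That connection is an interpretation of the abstract, not a detail it states.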

Conclusions

The proposed framework simultaneously considers correction for confounding variables, selection of effective confounders, random effects from the paired design, and integration by meta-analysis. The approach improved detection of disease-related biomarkers and pathways, which greatly enhanced understanding of MDD neurobiology. The statistical framework can be applied to similar experimental designs encountered in other complex and heterogeneous diseases.