This article is part of the supplement: Genetic Analysis Workshop 16

Open Access Proceedings

Two-stage joint selection method to identify candidate markers from genome-wide association studies

Zheyang Wu1, Chatchawit Aporntewan2, David H Ballard3, Ji Young Lee4, Joon Sang Lee1,3 and Hongyu Zhao1,5*

Author Affiliations

1 Department of Epidemiology and Public Health, Yale University, 60 College Street, New Haven, Connecticut 06051, USA

2 Department of Psychiatry, Yale University, 300 George Street, New Haven, Connecticut 06511, USA

3 Program in Computational Biology and Bioinformatics, Yale University, P.O. Box 208114, New Haven, Connecticut 06520-8114, USA

4 Biostatistics Resource, Keck Laboratory, Yale University, 300 George Street, New Haven, Connecticut, USA

5 Department of Genetics, Yale University School of Medicine, 333 Cedar Street, P.O. Box 208005, New Haven, Connecticut 06520-8005, USA

BMC Proceedings 2009, 3(Suppl 7):S29  doi:

Published: 15 December 2009

Abstract

Interactions among multiple genes and environmental factors can affect an individual's susceptibility to disease. Genes that affect disease risk through interactions with other genes may not show strong marginal associations and, as a result, may not be identified by the single-marker methods widely used in genome-wide association studies. To explore this possibility in real data, we carried out a two-stage model selection procedure for joint single-nucleotide polymorphism (SNP) analysis to detect genes associated with rheumatoid arthritis (RA) using the Genetic Analysis Workshop 16 genome-wide association study data. In the first stage, the genetic markers were screened through an exhaustive two-dimensional search, which identified promising SNPs and SNP pairs. In the second stage, LASSO was used to choose putative SNPs from the candidates identified in the first stage. We then used the RA data collected by the Wellcome Trust Case Control Consortium to validate the putative genetic factors. By balancing computational load and statistical power, this method detects joint effects that may fail to emerge from single-marker analysis. Using this approach, we not only replicated the identification of important RA risk genes but also found novel genes and their epistatic effects on RA. To our knowledge, this is the first analysis of a real genome-wide association study based on a two-dimensional scan.
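
The two-stage procedure described in the abstract can be illustrated with a minimal sketch on toy data. The abstract does not state which screening statistic, genotype coding, or LASSO solver was used, so everything below is an assumption made for illustration: a Pearson chi-square statistic for the one- and two-dimensional scans, additive 0/1/2 genotype coding with product terms for SNP pairs, and an L1-penalized logistic regression fitted by proximal gradient descent. All function names (`chi_square`, `joint_stat`, `screen`, `design_row`, `lasso_logistic`) are hypothetical and are not taken from the authors' software.

```python
import math
import random
from itertools import combinations


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for row, rs in zip(table, row_sums):
        for obs, cs in zip(row, col_sums):
            exp = rs * cs / total
            if exp > 0:
                stat += (obs - exp) ** 2 / exp
    return stat


def joint_stat(genos, status, snps):
    """Chi-square of genotype combination at `snps` vs. case/control status."""
    counts = {}
    for g, y in zip(genos, status):
        key = tuple(g[s] for s in snps)
        counts.setdefault(key, [0, 0])[y] += 1  # [controls, cases]
    return chi_square(list(counts.values()))


def screen(genos, status, top_single, top_pair):
    """Stage 1 (assumed form): exhaustive 1-D and 2-D scan, keep top-ranked terms."""
    m = len(genos[0])
    singles = sorted(range(m), key=lambda s: -joint_stat(genos, status, (s,)))
    pairs = sorted(combinations(range(m), 2),
                   key=lambda p: -joint_stat(genos, status, p))
    return [(s,) for s in singles[:top_single]] + list(pairs[:top_pair])


def design_row(g, terms):
    """One feature per term: genotype code, or product of codes for a pair."""
    return [math.prod(g[s] for s in t) for t in terms]


def soft_threshold(x, t):
    """Shrink x toward zero by t (the proximal operator of the L1 penalty)."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def lasso_logistic(X, y, lam, step=0.05, iters=500):
    """Stage 2 (assumed form): L1-penalized logistic regression via ISTA."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p  # intercept is left unpenalized
    for _ in range(iters):
        resid = []
        for xi, yi in zip(X, y):
            eta = b0 + sum(b * v for b, v in zip(beta, xi))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            resid.append(1.0 / (1.0 + math.exp(-eta)) - yi)
        b0 -= step * sum(resid) / n
        for j in range(p):
            grad = sum(r * xi[j] for r, xi in zip(resid, X)) / n
            beta[j] = soft_threshold(beta[j] - step * grad, step * lam)
    return b0, beta


if __name__ == "__main__":
    random.seed(0)
    # Toy data: 300 individuals, 8 SNPs coded 0/1/2; SNPs 0 and 1 act jointly.
    genos = [[random.choice((0, 1, 2)) for _ in range(8)] for _ in range(300)]
    status = [int(random.random() < (0.8 if (g[0] > 0) != (g[1] > 0) else 0.2))
              for g in genos]
    terms = screen(genos, status, top_single=3, top_pair=3)
    X = [design_row(g, terms) for g in genos]
    _, beta = lasso_logistic(X, status, lam=0.02)
    for term, coef in zip(terms, beta):
        print(term, round(coef, 3))
```

In this sketch the pairwise scan costs on the order of m² tests for m markers, which is why a screening stage precedes the penalized regression; the cutoffs `top_single` and `top_pair` and the penalty `lam` are arbitrary illustrative values, not the thresholds used in the study.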