This article is part of the supplement: Genetic Analysis Workshop 16
Case-control genome-wide association study of rheumatoid arthritis from Genetic Analysis Workshop 16 using penalized orthogonal-components regression-linear discriminant analysis
1 Department of Statistics, Purdue University, 150 North University Street, West Lafayette, IN 47907, USA
2 Department of Foods and Nutrition, Purdue University, 700 West State Street, West Lafayette, IN 47907, USA
BMC Proceedings 2009, 3(Suppl 7):S17 doi:Published: 15 December 2009
Currently, genome-wide association studies (GWAS) are conducted by collecting a massive number of SNPs (i.e., large p) for a relatively small number of individuals (i.e., small n) and associations are made between clinical phenotypes and genetic variation one single-nucleotide polymorphism (SNP) at a time. Univariate association approaches like this ignore the linkage disequilibrium between SNPs in regions of low recombination. This results in a low reliability of candidate gene identification. Here we propose to improve the case-control GWAS approach by implementing linear discriminant analysis (LDA) through a penalized orthogonal-components regression (POCRE), a newly developed variable selection method for large p small n data. The proposed POCRE-LDA method was applied to the Genetic Analysis Workshop 16 case-control data for rheumatoid arthritis (RA). In addition to the two regions on chromosomes 6 and 9 previously associated with RA by GWAS, we identified SNPs on chromosomes 10 and 18 as potential candidates for further investigation.