This article is part of the supplement: Genetic Analysis Workshop 16
Comparison between the stochastic search variable selection and the least absolute shrinkage and selection operator for genome-wide association studies of rheumatoid arthritis
Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA
BMC Proceedings 2009, 3(Suppl 7):S21 doi:Published: 15 December 2009
Because multiple loci control complex diseases, there is great interest in testing markers simultaneously instead of one by one. In this paper, we applied two model selection algorithms: the stochastic search variable selection (SSVS) and the least absolute shrinkage and selection operator (LASSO) to two quantitative phenotypes related to rheumatoid arthritis (RA).
The Genetic Analysis Workshop 16 data includes 2,062 unrelated individuals and 545,080 single-nucleotide polymorphism markers from the Illumina 550 k chip. We performed our analyses on the cases as the quantitative phenotype data was not provided for the controls. The performance of the two algorithms was compared. Using sure independence screening as the prescreening procedure, both SSVS and LASSO give small models. No markers are identified in the human leukocyte antigen region of chromosome 6 that was shown to be associated with RA. SSVS and LASSO identify seven common loci, and some of them are on genes LRRC8D, LRP1B, and COLEC12. These genes have not been reported to be associated with RA. LASSO also identified a common locus on gene KTCD21 for the two phenotypes (marker rs230662 and rs483731, respectively).
SSVS outperforms LASSO in simulation studies. Both SSVS and LASSO give small models on the RA data, however this depends on model parameters. We also demonstrate the ability of both LASSO and SSVS to handle more markers than the number of samples.