This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
Enhancing the discovery of rare disease variants through hierarchical modeling
Division of Biostatistics, Department of Preventive Medicine, University of Southern California, 2001 North Soto Street, SSB 202Q, MC 9234, Los Angeles, CA 90089-9234, USA
BMC Proceedings 2011, 5(Suppl 9):S16 doi:10.1186/1753-6561-5-S9-S16
Published: 29 November 2011
Advances in next-generation sequencing technology are enabling researchers to capture a comprehensive picture of genomic variation across large numbers of individuals with unprecedented levels of efficiency. The main analytic challenge in disease mapping is how to mine the data for rare causal variants among a sea of neutral variation. To achieve this goal, investigators have proposed a number of methods that exploit biological knowledge. In this paper, I propose applying a Bayesian stochastic search variable selection algorithm in this context. My multivariate method is inspired by the combined multivariate and collapsing method. In this proposed method, however, I allow an arbitrary number of different sources of biological knowledge to inform the model as prior distributions in a two-level hierarchical model. This allows rare variants with similar prior distributions to share evidence of association. Using the 1000 Genomes Project single-nucleotide polymorphism data provided by Genetic Analysis Workshop 17, I show that through biologically informative prior distributions, some power can be gained over noninformative prior distributions.
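To make the two-level structure concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes a logistic second-level model in which each variant's prior inclusion probability depends on a row of biological annotations, and a first-level Bernoulli prior on the inclusion indicators. The annotation matrix `Z`, the coefficients `alpha`, the intercept, and both function names are hypothetical and chosen only for illustration. In a full hierarchical analysis the coefficients would be estimated from the data rather than fixed, which is how variants with similar annotations would come to share evidence.

```python
import math


def prior_inclusion_probs(Z, alpha, intercept):
    """Assumed second-level model: logit(pi_j) = intercept + Z[j] . alpha."""
    probs = []
    for row in Z:
        eta = intercept + sum(z * a for z, a in zip(row, alpha))
        probs.append(1.0 / (1.0 + math.exp(-eta)))
    return probs


def log_prior_of_model(gamma, probs):
    """Assumed first-level prior: gamma_j ~ Bernoulli(pi_j), independently."""
    return sum(
        math.log(p) if g else math.log(1.0 - p)
        for g, p in zip(gamma, probs)
    )


# Hypothetical annotations, one row per variant.
# Columns (illustrative only): [nonsynonymous, in candidate pathway]
Z = [
    [1, 1],
    [1, 1],
    [0, 0],
    [1, 0],
]
alpha = [1.0, 0.5]   # hypothetical second-level coefficients (fixed here)
intercept = -3.0     # hypothetical baseline log-odds of inclusion

probs = prior_inclusion_probs(Z, alpha, intercept)

# Log prior of a candidate model that includes only the first variant.
gamma = [1, 0, 0, 0]
log_prior = log_prior_of_model(gamma, probs)
```

Under these assumptions, the first two variants have identical annotation rows and therefore receive the same prior inclusion probability, while the unannotated third variant receives the lowest. A stochastic search over `gamma` would combine a log prior of this kind with the likelihood when deciding whether to add or drop a variant.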