This article is part of the supplement: Genetic Analysis Workshop 17: Unraveling Human Exome Data
Detecting functional rare variants by collapsing and incorporating functional annotation in Genetic Analysis Workshop 17 mini-exome data
1 Department of Epidemiology and Public Health, Yale University, New Haven, CT 06520, USA
2 Hubei Bioinformatics and Molecular Imaging Key Laboratory, Huazhong University of Science and Technology, Wuhan, Hubei, China
3 Keck Laboratory, Yale University, New Haven, CT 06511, USA
BMC Proceedings 2011, 5(Suppl 9):S27 doi:10.1186/1753-6561-5-S9-S27
Published: 29 November 2011
Association studies using tag SNPs have been successful in detecting disease-associated common variants. However, common variants, with rare exceptions, explain at most 5–10% of the heritability attributable to genetic factors, which has led to the common disease/rare variants assumption. Indeed, recent studies using sequencing technologies have demonstrated that common diseases can be due to rare variants that could not be systematically studied earlier. Unfortunately, methods designed for common variants are not optimal when applied to rare variants. To identify rare variants that affect disease risk, several investigators have designed new approaches based on the idea of collapsing different rare variants within the same genomic block (e.g., the same gene or pathway) to enrich the signal. Here, we consider three different collapsing methods in a multimarker regression model and compare their performance on the Genetic Analysis Workshop 17 data, using the consistency of results across different simulations and the cross-validation prediction error rate as criteria. The comparison shows that the proportion collapsing method seems to outperform the other two methods and can find both truly associated rare and common variants. Moreover, we explore one way of incorporating functional annotations for the variants in the data: collapsing nonsynonymous and synonymous variants separately, which allows different penalties to be applied to them. The incorporation of functional annotations led to higher sensitivity and specificity when the detection results were compared with the answer sheet. The initial analysis was performed without knowledge of the simulating model.
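The collapsing idea described above can be illustrated with a minimal sketch. It is not the implementation used in the study. It assumes that genotypes are coded as minor-allele counts (0, 1, 2) in an individuals-by-variants matrix, that "proportion collapsing" means the fraction of rare-variant sites in a block at which an individual carries a minor allele, and that an indicator (carrier/non-carrier) score is one of the alternatives. The function names and the toy data are hypothetical.

```python
import numpy as np

def collapse_indicator(genotypes):
    """Return 1.0 if an individual carries a minor allele at any site in the block, else 0.0."""
    genotypes = np.asarray(genotypes)
    return (genotypes.sum(axis=1) > 0).astype(float)

def collapse_proportion(genotypes):
    """Return the fraction of sites in the block at which an individual carries a minor allele.

    Assumed definition of 'proportion collapsing'; the abstract does not define it.
    """
    genotypes = np.asarray(genotypes)
    if genotypes.shape[1] == 0:
        # An empty block contributes no signal.
        return np.zeros(genotypes.shape[0])
    return (genotypes > 0).mean(axis=1)

def collapse_by_annotation(genotypes, is_nonsynonymous, collapse=collapse_proportion):
    """Collapse nonsynonymous and synonymous sites separately.

    Returns a two-column matrix (nonsynonymous score, synonymous score), so that a
    penalized multimarker regression could penalize the two columns differently.
    """
    genotypes = np.asarray(genotypes)
    mask = np.asarray(is_nonsynonymous, dtype=bool)
    ns_score = collapse(genotypes[:, mask])
    syn_score = collapse(genotypes[:, ~mask])
    return np.column_stack([ns_score, syn_score])

# Toy block: 4 individuals x 4 rare variants (hypothetical data).
G = np.array([
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 1],
])
# Hypothetical annotation: the first two sites are nonsynonymous.
is_ns = np.array([True, True, False, False])

indicator_score = collapse_indicator(G)
proportion_score = collapse_proportion(G)
annotated_scores = collapse_by_annotation(G, is_ns)
```

In this sketch each gene contributes one predictor (or two, when split by annotation) to the regression instead of one predictor per variant, which is what allows the sparse signal from individually rare variants to accumulate.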