Open Access Research article

Comparison of similarity-based tests and pooling strategies for rare variants

Sergii Zakharov1,2*, Agus Salim2 and Anbupalam Thalamuthu1*

Author Affiliations

1 Human Genetics, Genome Institute of Singapore, 60 Biopolis Street, Singapore 138672, Singapore

2 Saw Swee Hock School of Public Health, National University of Singapore, 16 Medical Drive, Singapore 117597, Singapore

BMC Genomics 2013, 14:50  doi:10.1186/1471-2164-14-50

Published: 24 January 2013

Abstract

Background

Because several rare genomic variants have been shown to affect common phenotypes, rare-variant association analysis has received considerable attention. Several efficient association tests based on genotype and phenotype similarity measures have been proposed in the literature. The major advantages of similarity-based tests are their ability to accommodate multiple types of DNA variation within a single association test and to account for possible interactions within a region. However, little work has been done to compare the performance of similarity-based tests in rare-variant association scenarios, especially when they are applied with different rare-variant pooling strategies.

Results

Using population genetics simulations and an analysis of a publicly available sequencing data set, we compared the performance of four similarity-based tests and two rare-variant pooling strategies, collapsing and weighting. We showed that the weighting approach outperforms collapsing in the presence of a strong effect from rare variants and in the presence of a moderate effect from common variants, whereas collapsing of rare variants is preferable when common variants have a strong effect. We also demonstrated that the difference in statistical power between the two pooling strategies may be substantial. The results also highlighted the consistently high power of two of the similarity-based approaches when applied with an appropriate pooling strategy.

Conclusions

Population genetics simulations and the analysis of a sequencing data set showed high power for two of the similarity-based tests and a substantial difference in power between the two pooling strategies.

Keywords:
Genetics; Similarity; Power; Multi-locus; Association analysis; Rare variants; Collapsing; Weighting