This article is part of the supplement: Genetic Analysis Workshop 16

Open Access Highly Accessed Proceedings

The effect of minor allele frequency on the likelihood of obtaining false positives

Meredith E Tabangin 1*, Jessica G Woo 1,2 and Lisa J Martin 1,2

Author Affiliations

1 Division of Biostatistics and Epidemiology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Mail Code 5041, Cincinnati, Ohio 45229, USA

2 The University of Cincinnati College of Medicine, 231 Albert Sabin Way, Cincinnati, Ohio 45267, USA

BMC Proceedings 2009, 3(Suppl 7):S41  doi:10.1186/1753-6561-3-S7-S41

Published: 15 December 2009

Abstract

Determining the most promising single-nucleotide polymorphisms (SNPs) is a challenge in genome-wide association studies, in which hundreds of thousands of association tests are conducted. The power to detect genetic effects depends on minor allele frequency (MAF), and the SNP arrays used in genome-wide association studies include SNPs with a wide distribution of MAFs. It is therefore critical to understand the effect of MAF on the false-positive rate.

The Framingham Heart Study simulated data (Problem 3, with answers) were used to examine the effect of varying MAF on the likelihood of false positives. Replication set 1 was used to generate 1 million permutations of case/control status in unrelated individuals. Logistic regression with an additive model was used to test for association between each SNP and myocardial infarction. We report the number of "significant" tests by MAF at α = 10^-4, 10^-5, and 10^-6.
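The permutation procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: genotypes are simulated under Hardy-Weinberg equilibrium rather than taken from the Framingham simulated data, the sample sizes and function names are hypothetical, and a score test for the additive logistic model (closely related to the Cochran-Armitage trend test) stands in for a fitted logistic regression so that the example needs only the standard library.

```python
import math
import random


def additive_score_test(genotypes, status):
    """P-value of the score test for an additive logistic model.

    genotypes: minor-allele counts (0, 1, or 2) per individual.
    status: 1 for case, 0 for control.
    NOTE: this score test is a stand-in for the fitted logistic
    regression described in the text (an assumption of this sketch).
    """
    n = len(genotypes)
    y_bar = sum(status) / n
    g_bar = sum(genotypes) / n
    score = sum(g * (y - y_bar) for g, y in zip(genotypes, status))
    info = y_bar * (1 - y_bar) * sum((g - g_bar) ** 2 for g in genotypes)
    if info == 0:
        return 1.0  # monomorphic SNP or single-class phenotype: nothing to test
    chi2 = score ** 2 / info
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def count_false_positives(maf, n_people, n_cases, n_perm, alpha, seed=0):
    """Count permutations whose p-value falls below alpha for one SNP.

    All arguments are hypothetical illustration parameters; genotypes are
    simulated under Hardy-Weinberg equilibrium at the given MAF.
    """
    rng = random.Random(seed)
    # Each individual carries two alleles, each minor with probability maf.
    genotypes = [(rng.random() < maf) + (rng.random() < maf)
                 for _ in range(n_people)]
    status = [1] * n_cases + [0] * (n_people - n_cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(status)  # permuting labels breaks any true association
        if additive_score_test(genotypes, status) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    # Far fewer permutations than the 1 million in the text, to run quickly.
    for maf in (0.05, 0.25, 0.50):
        hits = count_false_positives(maf, n_people=500, n_cases=100,
                                     n_perm=2000, alpha=0.05)
        print(f"MAF={maf:.2f}: {hits} of 2000 permutations below alpha")
```

Because case/control labels are shuffled, every "significant" result in this sketch is a false positive by construction, which is what allows the observed count to be compared with the number expected at a given α.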

Common SNPs exhibited fewer false positives than expected. At α = 10^-4, SNPs with MAF of 25% and 50% resulted in 69.2 [95% CI: 62.8-75.6] and 70.8 [95% CI: 61.3-80.4] false positives, respectively, compared with 100 expected. Rare SNPs exhibited more variability but did not show more false-positive results than expected by chance. However, at α = 10^-4, SNPs with MAF of 5% exhibited significantly more false positives (105.5 [95% CI: 81-130.1]) than SNPs with MAF of 25% or 50%. Similar results were seen at the other α values.
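The figure of 100 expected false positives follows from multiplying the 1 million permutations by α = 10^-4. The short sketch below works through that arithmetic and checks each confidence interval reported above against the expected count. The function names are hypothetical, and reading an interval that excludes the expected count as "fewer" or "more" than expected is a simplifying assumption of this example rather than the authors' stated test.

```python
N_PERMUTATIONS = 1_000_000  # permutations of case/control status, from the text


def expected_false_positives(alpha, n_tests=N_PERMUTATIONS):
    """Expected number of null tests with p < alpha if p-values are uniform."""
    return n_tests * alpha


def compare_to_expected(ci_low, ci_high, expected):
    """Classify a confidence interval relative to the expected count."""
    if ci_high < expected:
        return "fewer than expected"
    if ci_low > expected:
        return "more than expected"
    return "consistent with expected"


# 95% confidence intervals at alpha = 1e-4, as reported in the text.
reported = {
    "MAF 5%": (81.0, 130.1),
    "MAF 25%": (62.8, 75.6),
    "MAF 50%": (61.3, 80.4),
}

if __name__ == "__main__":
    expected = expected_false_positives(1e-4)
    for label, (low, high) in reported.items():
        print(label, "->", compare_to_expected(low, high, expected))
```

The same function gives the expected counts at the other thresholds: about 10 at α = 10^-5 and about 1 at α = 10^-6.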

These results suggest that removal of low MAF SNPs from analysis due to concerns about inflated false-positive results may not be appropriate.