- Software
- Open access
- Published:
Computing power of quantitative trait locus association mapping for haploid loci
BMC Bioinformatics volume 10, Article number: 261 (2009)
Abstract
Background
Statistical power calculations are a critical part of any study design for gene mapping. Most calculations assume that the locus of interest is biallelic. However, there are common situations in human genetics such as X-linked loci in males where the locus is haploid. The purpose of this work is to mathematically derive the biometric model for haploid loci, and to compute power for QTL mapping when the loci are haploid.
Results
We have derived the biometric model for power calculations for haploid loci and have developed software to perform these calculations. We have verified our calculations with independent mathematical methods.
Conclusion
Our results fill a need in power calculations for QTL mapping studies. Furthermore, failure to appropriately model haploid loci may cause underestimation of power.
Background
Statistical power calculations are a critical part of any study design for gene mapping. With regards to quantitative trait locus (QTL) mapping, it is typically assumed that there are two allele at the QTL, and hence three genotypes (e.g., see Lynch and Walsh [1]). For each genotype, there is a corresponding mean quantitative value that is determined by the biometric model developed by Falconer and others [2]. To determine the mean of the heterozygote, we need a parameter known as the dominance parameter. Software to compute power for the biallelic situation has been made available in the Genetic Power Calculator, developed by Purcell et al. [3] However, for haploid organisms or loci, there is no such parameter because there are no heterozygotes. Thus, there is a different set of calculations necessary to compute power for association. There has been an extensive amount of literature written on the analysis of haploid data (see, e.g., Wu et al. [4, 5], Tulsieram et al. [6]) To our knowledge, there has been no power calculator developed for QTL mapping with haploid loci. This is an oversight, since there are several common situations in humans involving loci that are haploid or hemizygous [7, 8].
The purpose of this work is to mathematically derive the biometric model for haploid loci, and to compute power for QTL mapping when the loci are haploid.
Implementation
The program for computing power is written in C++ and may be created on any computer that has a C++ compiler, for example those computers with a Unix or Linux operating system.
After making the executable, the user types the phrase "power-ttesthap" to implement the program. The user is queried for the sample size of the study. Next, the user is asked for the significance level at which simulation power is to be computed. When this value is entered, the threshold for the t-test is computed and reported to the screen. Following this, the user is asked about the locus-specific quantitative trait heritability Q. When this value is entered, the program determines the means of the allele 0 group and the allele 1 group, as well as the variance in each group. Finally, the user is asked to specify the number of replicates for which power is computed. We recommend at least 100,000 replicates. See Figure 1 for an example of how data are entered, and see additional file 1. Below, we place an example run.
Results and discussion
Calculations of simulation power as function of Locus specific QTL heritability and increaser allele frequency
In Table 1, we provide an abbreviation of all the variables used in the Results section. In Figure 2, we plot the simulation power as a function of the Locus specific QTL heritability and the increaser allele frequency, using the ranges of Q and p that we specified in the Methods. There are several conclusions we may draw from studying this graph. The first is that, as Q increases, the power increases for any value of p. The second conclusion is that, if Q is held constant, the power differs minimally over the range of p. In fact, if we consider the set of consecutive differences ((power for Q = 0.05 and p = 0.25) – (power for Q = 0.05 and p = 0.1)), ((power for Q = 0.05 and p = 0.5) – (power for Q = 0.05 and p = 0.25)),..., ((power for Q = 0.1 and p = 0.9) – (power for Q = 0.1 and p = 0.75)), the maximum difference is 0.012, and the average difference is -0.00025. The maximum difference occurs for the difference ((power for Q = 0.07 and p = 0.25) – (power for Q = 0.07 and p = 0.1)). We note that the power was calculated using the executable file that is compiled from the material in Additional File 2.
By contrast, if we consider the set of consecutive differences ((power for Q = 0.06 and p = 0.1) – (power for Q = 0.05 and p = 0.1)), ((power for Q = 0.07 and p = 0.1) – (power for Q = 0.06 and p = 0.1)),..., ((power for Q = 0.1 and p = 0.9) – (power for Q = 0.09 and p = 0.9)), the maximum difference is 0.182 and the average difference is 0.085. The maximum difference occurs for ((power for Q = 0.06 and p = 0.75) – (power for Q = 0.05 and p = 0.75)).
Another result from this graph is that power of at least 80% may be achieved when the Locus specific QTL heritability Q is at least 0.07, suggesting that genes for QTLs with reasonable variance may be mapped even when the significance level is stringent.
Real data example – EFHC2 gene SNP rs7055196 typed in 45, X Turner Syndrome subjects
Zinn et al. genotyped 97 45, X Turner Syndrome subjects at the SNP rs7055196. They also phenotyped the subjects for Facial Affect Fear Recognition score and performed a t-test to determine whether there was a significant difference in the scores based on the allele at the rs7055196 locus [9]. They found no significant difference among the two groups (49.8 +/- 26.5 for those with A allele versus 46.2 +/-26.1 for those with G allele; p-value = 0.67; see Figure 1 of Zinn et al.).
Because there are 11 subjects with the G allele and 86 subjects with the A allele, we estimate the increaser allele frequency to be 0.87. The sample size we consider is 97. We specify a range of locus-specific QTL heritabilities from 0.005 to 0.15 in increments of 0.005. Since we are only performing a single test, we consider a significance level of 0.05. We present our results in Figure 3, where each simulation data point is created using 100,000 replicates. In that figure, we see that power increases as the heritability goes up (similar to Figure 2). Also, a heritability of 0.04 gives a simulation power of approximately 0.50. These results suggest that the heritability is lower rather than higher.
Conclusion
To our knowledge, this is the first work to determine the power of QTL mapping for haploid loci or organisms. We compute power by simulation, and our results with exact analytic power suggest that the power by simulation is highly accurate. The most common application of these results is SNP association studies of loci on the X or Y chromosomes in male subjects. Other situations where haploidy arises in humans include chromosome disorders such as 45,X Turner syndrome [7] and segmental aneuploidies such as Cri-du-Chat (5p-) syndrome [8]. Last, the power to map QTLs in haplo/diploid organisms such as Saccharomyces cerevisiae can be increased by studying organisms in the haploid state [10].
Natural extensions of this work are situations such as gene-gene interactions (epistasis of different kinds), gene-environment interactions, and multi-trait mapping. This is the subject of future research. The purpose of this work is to provide researchers with the simplest non-trivial example of power calculations for haploid data.
Availability and requirements
Project name: t-test power calculator for haploid data
Project home page: http://compgen.rutgers.edu/ttesthap/
Operating system(s): Unix Solaris, Linux
Programming language: C++
Other requirements: None.
License: None.
Any restrictions to use by non-academics: None.
References
Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. Sunderland: Sinauer Associates, Inc; 1998.
Falconer DS, Mackay TFC: Introduction to Quantitative Genetics. Fourth edition. San Francisco: Benjamin Cummings; 1996.
Purcell S, Cherny SS, Sham PC: Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics 2003, 19(1):149–150. 10.1093/bioinformatics/19.1.149
Wu RL: Mapping quantitative trait loci by genotyping haploid tissues. Genetics 1999, 152(4):1741–1752.
Wu RL, O'Malley DM, McKeand SE: Understanding the genetic architecture of a quantitative trait in gymnosperms by genotyping haploid megagametophytes. Theor Appl Genet 1999, 99(6):1031–1038. 10.1007/s001220051411
Tulsieram LK, Glaubitz JC, Kiss G, Carlson JE: Single tree genetic linkage mapping in conifers using haploid DNA from megagametophytes. Biotechnology (N Y) 1992, 10(6):686–690. 10.1038/nbt0692-686
Ford CE, Jones KW, Polani PE, De Almeida JC, Briggs JH: A sex-chromosome anomaly in a case of gonadal dysgenesis (Turner's syndrome). Lancet 1959, 1(7075):711–713. 10.1016/S0140-6736(59)91893-8
Lejeune J, Lafourcade J, Berger R, Vialatte J, Boeswillwald M, Seringe P, Turpin R: [3 Cases of Partial Deletion of the Short Arm of a 5 Chromosome.]. C R Hebd Seances Acad Sci 1963, 257: 3098–3102.
Faul F, Erdfelder E, Lang AG, Buchner A: G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods 2007, 39(2):175–191.
Zinn AR, Kushner H, Ross JL: EFHC2 SNP rs7055196 is not associated with fear recognition in 45,X Turner syndrome. Am J Med Genet B Neuropsychiatr Genet 2008, 147B(4):507–509. 10.1002/ajmg.b.30625
Acknowledgements
The authors gratefully acknowledge grant NIH NS35554 to ARZ. Also, the authors gratefully acknowledge the help of Dr. Tara Matise, who created the home page for their software.
Author information
Authors and Affiliations
Corresponding author
Additional information
Authors' contributions
DG and ARZ conceived of the study and wrote the manuscript. DG created the software program. Both authors read and approved the final manuscript.
Electronic supplementary material
12859_2009_2991_MOESM1_ESM.doc
Additional file 1: Methods for haploid power. This file provides the description of the method used for calculating power. (DOC 55 KB)
12859_2009_2991_MOESM2_ESM.tar
Additional file 2: Files for compiling power-ttesthap (executable). This file containing the information needed to produce the executable that computes haploid power for a fixed sample size. (TAR 810 KB)
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Gordon, D., Zinn, A.R. Computing power of quantitative trait locus association mapping for haploid loci. BMC Bioinformatics 10, 261 (2009). https://doi.org/10.1186/1471-2105-10-261
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/1471-2105-10-261