Statistics Unit, Dalarna University, SE-781 70 Borlänge, Sweden

Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, SE-750 07 Uppsala, Sweden

Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7265, U.S.A

Abstract

A number of recent works have introduced statistical methods for detecting genetic loci that affect phenotypic variability, which we refer to as variability-controlling quantitative trait loci (vQTL). These are genetic variants whose allelic state predicts how much phenotype values will vary about their expected means. Such loci are of great potential interest in both human and non-human genetic studies, one reason being that a detected vQTL could represent a previously undetected interaction with other genes or environmental factors. The simultaneous publication of these new methods in different journals has in many cases precluded opportunity for comparison. We survey some of these methods, the respective trade-offs they imply, and the connections between them. The methods fall into three main groups: classical non-parametric, fully parametric, and semi-parametric two-stage approximations. Choosing between alternatives involves balancing the need for robustness, flexibility, and speed. For each method, we identify important assumptions and limitations, including those of practical importance, such as their scope for including covariates and random effects. We show in simulations that both parametric methods and their semi-parametric approximations can give elevated false positive rates when they ignore mean-variance relationships intrinsic to the data generation process. We conclude that choice of method depends on the trait distribution, the need to include non-genetic covariates, and the population size and structure, coupled with a critical evaluation of how these fit with the assumptions of the statistical model.

Traditional approaches to mapping genes affecting quantitative traits have focused on identifying loci for which an allelic substitution shifts the phenotype of interest in a particular direction (eg, where substituting genotype AA for AG causes the phenotype to increase on average by a certain amount). Such quantitative trait loci (QTL) can be described as mean-controlling, because they are primarily observed to affect the expected, ie, average, value of individuals’ phenotypes. It is also possible, however, to detect genetic loci whose allelic substitutions are associated with an increase in variability of the phenotype about its expected value.

Figure

**Relation between a vQTL and an epistatic interaction.** Panel (a) plots phenotype values in arbitrary units for a population of 500 outbred individuals stratified by genotype at a hypothetical vQTL. Panel (b) shows how the pattern in (a) could have arisen through a simple (mean-controlling) epistatic interaction with a second locus, possibly on another chromosome, that segregates two genotypes (black and gray).

The biology that explains a vQTL signal can be profound, although its nature will depend on context and may require further modeling and/or experiments to characterize even broadly. Simplistically: a vQTL alerts the researcher to the presence of unmodeled statistical interactions associated with the locus. These could include interactions of potentially high order with other genes or environmental factors, and may implicate a pivotal role for the locus in maintaining phenotypic robustness and/or variability in the face of changing environment, background genetics, and temporal progression. Yet as deep as the implications of such vQTL signals may be, a crucial practical concern is how to detect them in a manner that is powerful, reliable, and robust not only to assumptions about statistical distributions but also to known features of an experiment or population that could potentially bias inference. Herein, we separate two issues: interpretation of vQTL, and statistical detection of vQTL. Both have received a recent surge of interest in the genetics literature

What vQTL are and where to find them

A number of recent studies

Although little is known about how common variance-controlling genes are in the genome or the magnitude of their contribution to trait variation in natural populations

Detecting vQTL is a shortcut for detecting interactions

One of the most exciting developments in this area is the use of methods that identify vQTL as a means to detect interactions. Hereby, epistasis can be detected using a fast search algorithm in one-dimensional parameter space, and G×E can be detected without the need to measure the interacting environmental effect

A current challenge is that statistical methods for detecting vQTL are immature relative to those for detecting mean-controlling QTL. This is in part because estimating effects on the variance is more difficult than on the mean

Classical non-parametric methods for detecting variance heterogeneity

Conover et al.

Nonetheless, both tests (and many other such classical tests of variance heterogeneity) require that the data can be grouped into genotype classes, and they lack a natural basis for inclusion of continuous covariates. It is often necessary in GWAS and QTL studies to control statistically for non-genetic covariates such as age, body weight, and other continuous or ordered measurements, including effects that control for uneven relatedness of individuals. Indeed this can be essential to, for instance, increase power or model confounder bias. Such concerns are no less relevant in vQTL detection. The fact that methods based on Levene’s test

Full parametric modeling of mean and variance

Parametric approaches are rooted in the idea that detection and interpretation is best served through specifying a generative model of the data, in this case one that describes how effects on the mean and variance combine to produce outcomes whose statistical properties approximate those of the true sampling distribution. Sorensen

Two-stage approximations to parametric models

Visscher & Posthuma

A less parametrically justifiable but more computationally convenient approach for incorporating an arbitrary set of predictors into a test for effects both in the mean and variance is to fit first a linear model for mean effects, and then fit a separate second linear model on some function of the residuals. Recently, Perry et al.

Nonetheless, we find that when the data are sizable enough to ensure that residuals are accurately estimated, SVLM and DGLM give similar performance. To illustrate this, we performed 1,000 trials of the simulation described in Struchalin et al., with phenotypes distributed as $y_i \sim N(\beta_F F_i + \beta_{gF} \cdot g_i F_i,\ 1)$, drawing genotypes as $g_i \sim \mathrm{Bin}(2, 0.4)$, and drawing the interacting factor as $F_i \sim N(0, 1)$, with main effect $\beta_F = 0.85$ and interaction $\beta_{gF} = 0.06$ (using their notation). In this setting, where the sample is large and the genotypes are relatively well balanced, SVLM and DGLM produced almost identical p-values (correlation 0.9996) and the power at the 5% significance level was 0.82 for both methods, whereas Levene’s test produced substantially different p-values and reduced power (0.69). The false positive rates (FPR) were also assessed for the different methods for simulated standard normal trait values, $y_i \sim N(0, 1)$, showing little or no inflation of the FPR (Table

False positive rates for simulations with no interaction effects (10,000 individuals and 1,000 replicates).

| **Estimation method** | **Simulated distribution: Gaussian** | **Simulated distribution: Poisson** |
| --- | --- | --- |
| Levene’s test | 0.053 | 0.497 |
| SVLM | 0.063 | 0.389 |
| DGLM-Gaussian | 0.064 | 0.632 |
| DGLM-Poisson | n.a. | 0.055 |

n.a. = not applicable.
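The interaction simulation and a generic two-stage test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SVLM or DGLM implementation used for the results above: the function names are hypothetical, the second stage simply regresses squared first-stage residuals on genotype, and p-values use a large-sample normal approximation for the slope.

```python
import math
import random
from statistics import NormalDist, fmean


def simulate_interaction_trait(n, beta_f=0.85, beta_gf=0.06, maf=0.4, seed=0):
    """Draw g_i ~ Bin(2, maf), F_i ~ N(0, 1) and
    y_i ~ N(beta_f * F_i + beta_gf * g_i * F_i, 1). F is not returned,
    mimicking an unobserved interacting factor."""
    rng = random.Random(seed)
    g = [sum(rng.random() < maf for _ in range(2)) for _ in range(n)]
    y = []
    for gi in g:
        f = rng.gauss(0.0, 1.0)
        y.append(rng.gauss(beta_f * f + beta_gf * gi * f, 1.0))
    return g, y


def slope_test(x, y):
    """Simple linear regression of y on x. Returns the slope, its
    z-statistic, a two-sided normal-approximation p-value, and residuals."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    z = slope / math.sqrt(sigma2 / sxx)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return slope, z, p, resid


def two_stage_variance_test(g, y):
    """Stage 1: fit the mean model y ~ g. Stage 2: regress the squared
    stage-1 residuals on g and test the slope."""
    _, _, _, resid = slope_test(g, y)
    squared = [r * r for r in resid]
    slope, z, p, _ = slope_test(g, squared)
    return slope, z, p


g, y = simulate_interaction_trait(n=10000, seed=1)
print(two_stage_variance_test(g, y))
```

Because the squared residuals are far from normally distributed, the second-stage p-value is only approximate, which is one reason such shortcuts are described as less parametrically justifiable.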

The value of simultaneously estimating mean and variance

The fully parametric models mentioned above fit both the mean and the variance simultaneously. Methods that make use of a (non-iterative) two-step approach, such as SVLM, have the disadvantage that they involve conditioning on an unknown

Errors in the residuals arise because these are a combination of true residuals and the under- or over-predictions of the imperfectly estimated mean part of the model. The accuracy of estimated squared residuals, that is, the correlation between true and estimated squared residuals, is governed by the “hat matrix” **H**, the matrix of the linear transformation between the observed and fitted response vectors ($\hat{\mathbf{y}} = \mathbf{H}\mathbf{y}$); specifically, it depends on $h_{ii}$, where $h_{ii}$ are the diagonal elements of **H**, referred to as “hat values” (or “leverages”
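To make the role of the hat values concrete, the sketch below computes them for a mean model with an intercept and a single genotype predictor, using the standard closed form for simple linear regression. The genotype counts are hypothetical. It relies on the general least-squares result that, under a constant error variance σ², the fitted residual of individual i has variance σ²(1 − h_ii); that result is textbook material rather than something taken from this review.

```python
from statistics import fmean


def hat_values(x):
    """Leverages for a regression with an intercept and one predictor:
    h_ii = 1/n + (x_i - mean(x))^2 / Sxx."""
    n = len(x)
    mx = fmean(x)
    sxx = sum((xi - mx) ** 2 for xi in x)
    return [1.0 / n + (xi - mx) ** 2 / sxx for xi in x]


# Hypothetical, deliberately unbalanced genotype counts.
genotypes = [0] * 80 + [1] * 18 + [2] * 2
h = hat_values(genotypes)

# One leverage per genotype class (identical within a class).
leverage_by_genotype = {g: h[genotypes.index(g)] for g in (0, 1, 2)}

# Fraction of the error variance retained by the fitted residual.
residual_variance_fraction = {
    g: 1.0 - v for g, v in leverage_by_genotype.items()
}
print(leverage_by_genotype)
print(residual_variance_fraction)
```

Individuals in the rare genotype class carry the highest leverage, so their fitted residuals are the most shrunken relative to the true residuals.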

Extensions: dealing with genotype uncertainty

Often the genotype at a marker is not known with certainty, but is available as a probability based on surrounding marker information. This is the general case in QTL analysis (ie, linkage mapping), and is typical in GWAS where genotypes are imputed. In ordinary QTL analysis and GWAS, genotype probabilities can be used in place of the observed genotypes (possibly after reformulation) as predictors in a linear model of the phenotype

Extensions: incorporating population structure and polygenic effects

A common problem in GWAS is accounting for effects of population structure. We

Model misspecification, the scale of measurement, and the coefficient of variation

One of the greatest challenges in vQTL detection is how to choose the scale of measurement on which to draw conclusions about estimated effects. When we detect a SNP whose alleles increase both the mean and the variance, should we interpret it as a vQTL with a significant marginal effect, or a mean-controlling QTL for a trait that was analyzed on the wrong scale? Suppose, for instance, that we had a cylindrical organism (such as a snake or a worm) whose body width increases with its body length, and we have a gene with a strong additive effect on body length, which is itself normally distributed. A GWAS on the volume of this cylindrical organism is likely to detect a QTL controlling both the mean and the variance, despite the fact that a simpler explanation is available.
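The cylindrical-organism example can be checked with a small simulation. The sketch below is illustrative only: the baseline length, effect size, and width-to-length ratio are hypothetical values, and width is assumed to be exactly proportional to length so that volume scales with the cube of length.

```python
import math
import random
import statistics


def simulate_volume_by_genotype(n_per_genotype=5000, effect=1.0, seed=1):
    """Body length has a purely additive genotype effect and a constant
    standard deviation of 1. Volume is derived from length for a cylinder
    whose width is a fixed (assumed) fraction of its length."""
    rng = random.Random(seed)
    width_to_length = 0.1  # hypothetical proportionality constant
    summary = {}
    for g in (0, 1, 2):
        lengths = [
            rng.gauss(10.0 + effect * g, 1.0) for _ in range(n_per_genotype)
        ]
        volumes = [
            math.pi * (width_to_length * length / 2.0) ** 2 * length
            for length in lengths
        ]
        summary[g] = {
            "length_mean": statistics.fmean(lengths),
            "length_sd": statistics.stdev(lengths),
            "volume_mean": statistics.fmean(volumes),
            "volume_sd": statistics.stdev(volumes),
        }
    return summary


for genotype, stats in simulate_volume_by_genotype().items():
    print(genotype, {k: round(v, 3) for k, v in stats.items()})
```

On the length scale only the mean changes with genotype, whereas on the volume scale both the mean and the standard deviation increase, which is the pattern that would be read as a vQTL.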

To circumvent such ambiguities, some researchers have used the coefficient of variation (CV; the standard deviation divided by the mean) to detect genotypic effects on the variance

In some cases, the trait is known to be well approximated by a distribution that has a known mean-variance relationship, such as the Poisson or gamma distribution. In such cases, it will often be preferable to model that distribution explicitly and define vQTL parameters as those that alter the variance in a way not already anticipated by concomitant effects on the mean. DGLMs provide a natural basis for such models when the known sampling distribution is a member of the exponential family

To illustrate the effect of a misspecified distribution on vQTL detection, we consider the following simulation study in which vQTL-detection methods are applied to a non-normal phenotype. Let the phenotype $Y_i$ of individual $i$ be distributed as $Y_i \sim \mathrm{Poisson}(\mu_i)$, where the individual’s genotype $g_i$ affects the phenotype through a relation on $\mu_i$ with genotype effect $\beta_g = 0.05$. This setup is similar to that used in Struchalin et al.
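A minimal sketch of a Poisson simulation of this kind is given below. It makes assumptions not stated above: a log link, log(μ_i) = β_0 + β_g g_i, a hypothetical baseline mean of 5, and equal numbers of individuals per genotype. It is meant only to show why a pure mean effect looks like a variance effect when the Poisson mean-variance relationship is ignored.

```python
import math
import random
from statistics import fmean, variance


def rpois(rng, mu):
    """Draw one Poisson(mu) value with Knuth's multiplication method,
    which is adequate for small means."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def poisson_summary_by_genotype(n_per_genotype=4000, beta0=math.log(5.0),
                                beta_g=0.05, seed=0):
    """Simulate Y ~ Poisson(exp(beta0 + beta_g * g)) for g = 0, 1, 2 and
    report the sample mean, variance, and variance-to-mean ratio."""
    rng = random.Random(seed)
    out = {}
    for g in (0, 1, 2):
        mu = math.exp(beta0 + beta_g * g)
        ys = [rpois(rng, mu) for _ in range(n_per_genotype)]
        m, v = fmean(ys), variance(ys)
        out[g] = {"mean": m, "variance": v, "variance_over_mean": v / m}
    return out


for genotype, stats in poisson_summary_by_genotype().items():
    print(genotype, {k: round(v, 3) for k, v in stats.items()})
```

The raw variance rises with genotype because it tracks the mean, while the variance-to-mean ratio stays close to one in every genotype class, so there is no variance effect beyond what the mean effect already implies.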

Connections to related literature: relationship QTL, and breeding livestock for uniformity

One interesting possible cause of a vQTL signal arises when a QTL affecting a primary trait of interest also affects another (secondary) trait in a way that can restrict variation of the primary trait (eg, in closely related morphological phenotypes). Cheverud et al.

We further note that several semi-parametric and fully parameterized estimation methods have been developed for animal breeding purposes over the last couple of decades. These methods incorporate the additive relationships between individuals to assess the possibility of reducing phenotypic variability in breeding programs. They are not directly applicable in GWAS or QTL analysis, but the modeling approach is very similar to that of vQTL detection. A future possibility is to perform standard GWAS using estimated breeding values for variability as response. A deeper understanding of new methods for vQTL detection could therefore be obtained by relating these to the already existing literature on genetic heterogeneity in animal breeding (see

Conclusion

Studies that develop statistical methods for vQTL detection, as well as ones that exemplify their use, feed a growing interest in a fascinating emerging area of complex trait genetics. We have reviewed some of the new methods for detecting vQTL, and provided commentary on their respective trade-offs. Classical group-based non-parametric methods such as Levene’s test can be robust to model misspecification but lack flexibility and the scope to include continuous covariates, genotype probabilities (eg, imputed genotypes) or random polygenic effects. Parametric methods fully account for the uncertainty of fitted parameters in both the mean and the variance parts of the model, and also allow fitting of covariates and random polygenic effects in both parts, but are more sensitive to modeling assumptions. Semi-parametric or two-stage approaches can be faster but come at the price of shortcuts that in some cases can lead to bias. The choice of method depends on the trait distribution, the need to include non-genetic covariates, and the population size and structure. We advise that the assumptions of the chosen model be evaluated and compared with those of alternatives, and we expect that, if this is done carefully, these methods could be of great use in the analysis of both GWAS and QTL mapping data.

Authors’ contributions

Both authors wrote the text and approved the final version of the manuscript.

Acknowledgements

LR recognises financial support by the Swedish Research Council FORMAS. WV acknowledges financial support from the University of North Carolina Lineberger Comprehensive Cancer Center.