Abstract
Background
The aim of this paper is to develop a flexible model for analysis of quantitative trait loci (QTL) in outbred line crosses, which includes both additive and dominance effects. Our flexible intercross analysis (FIA) model accounts for QTL that are not fixed within founder lines and is based on the variance component framework. Genome scans with FIA are performed using a score statistic, which does not require variance component estimation.
Results
Simulations of a pedigree with 800 F_{2} individuals showed that the power of FIA including both additive and dominance effects was almost 50% for a QTL with equal allele frequencies in both lines, complete dominance and a moderate effect, whereas the power of a traditional regression model was equal to the chosen significance level of 5%. The power of FIA without dominance effects included in the model was close to that obtained for FIA with dominance for all simulated cases except for QTL with overdominant effects. A genome-wide linkage analysis of experimental data from an F_{2} intercross between Red Jungle Fowl and White Leghorn was performed with both additive and dominance effects included in FIA. The score values for chicken body weight at 200 days of age were similar to those obtained in FIA analysis without dominance.
Conclusion
We have extended FIA to include QTL dominance effects. The power of FIA was superior, or similar, to standard regression methods for QTL effects with dominance. The difference in power for FIA with or without dominance is expected to be small as long as the QTL effects are not overdominant. We suggest that FIA with only additive effects should be the standard model to be used, especially since it is more computationally efficient.
Background
Large genetic differences between founder breeds are utilized in experimental crosses of outbred lines, which gives a high power of detecting quantitative trait loci (QTL) even for moderately sized pedigrees. The commonly used regression model to detect QTL assumes a biallelic QTL fixed within each of the two founder lines [1]. Most traits have a substantial within-breed heritability and we may therefore expect that some QTL are not fixed. If the QTL is not fixed within founder lines, the regression model will underestimate the QTL effect and the power to detect the QTL decreases [2]. In an earlier paper [3] we developed a flexible intercross analysis (FIA) to enhance the detection of QTL in experimental crosses of outbred lines. FIA is a variance component based model which is able to detect QTL at different degrees of fixation within founder lines. Genome scans in FIA are based on a score statistic, which gives a computationally efficient and statistically powerful method since it does not require estimation of variance components. The model is also flexible because it can be applied to advanced intercross lines with an arbitrary number of generations. We have shown that the power of FIA is similar to that of Haley-Knott (HK) regression [1] for fixed QTL and that FIA is superior to HK-regression for QTL that are not fixed within founder lines. We also showed that the difference between FIA and HK-regression is larger for pedigrees with small base generations than for pedigrees with large ones. However, the model was developed and tested for additive QTL only.
Other methods have previously been developed to account for within-line QTL variation. Most of these methods do not include dominance effects (e.g. [4]). Two exceptions are Knott et al. [5] and Pérez-Enciso et al. [6]. Knott et al. [5] developed a nested within half-sib family model that does not assume fixation of QTL alleles in the founder lines, and the number of alleles is only constrained by the number of families. This model was further developed by Kim et al. [7] for analysis of F_{2} intercrosses and includes both line effects and half-sib family effects. Dominance is estimated in the line effect whereas the family effect is an overall allele substitution effect. This is a model specifically designed for F_{2} intercrosses with fixed effects only, and the number of estimated parameters increases with the number of half-sib families. Furthermore, the genotypic information of the dams is not included in the model and the sires are assumed to be unrelated. Pérez-Enciso and Varona [2] developed a mixed QTL model that accounts for line differences and within-line variation of QTL effects. In this model, which is similar to the model developed by Wang et al. [4], a fixed line effect is estimated together with a random within-line QTL variance. This model was further extended to include dominance in Pérez-Enciso et al. [6]. A drawback of the model is, however, the difficulty of comparing estimates at different genomic locations, as the total QTL variance is a combination of fixed and random effects. The method is also slow since it utilizes a derivative-free method to maximize the log-likelihood in each tested chromosome position. There is therefore a need to develop a method which is computationally efficient, includes dominance and can be applied to general pedigrees from line crosses.
We may expect major genes to have considerable dominance effects [8], but this does not necessarily imply that the power of a QTL analysis will increase by including dominance effects in the statistical model. In a recent paper by Martinez [9], the power to detect a QTL having a dominance effect using a variance component (VC) model was studied. He found that the gain in power using a model with both additive and dominance effects was not substantial compared to a model with only additive effects as long as the QTL effect was not overdominant. In the simulation study performed by Martinez, non-inbred full-sib families were simulated and all founder QTL allele effects were assumed to be independent. FIA is a variance component based method which models dependencies between founder QTL allele effects. This difference between FIA and the model studied by Martinez [9] implies that Martinez' results cannot be directly applied to FIA.
The aim of this paper is to extend the FIA model to include both additive and dominance effects, where this extended version is computationally efficient and possible to apply to general pedigrees from line crosses. This version of FIA is then used to test the importance of including dominance in terms of power for QTL detection. We compare the power of the model, by means of simulations, with the original version of FIA and with HK-regression. The model is also applied to chicken body weight at 200 days of age in an F_{2} cross between wild Red Jungle Fowl and domestic Leghorn. The HK-regression model was chosen for comparison in our simulations because the assumptions of the model are simple and also because it is extensively used in QTL analysis (e.g. [1,10,11]).
Results and discussion
Simulation results for a QTL with additive and dominance effects
The performance of FIA and HK-regression was studied for a simulated QTL with no dominance (Figure 1a), complete dominance (Figure 1b) and overdominance (Figure 1c). Furthermore, four different cases (Table 1) were studied by varying the fixation level within lines for a biallelic QTL. With no simulated dominance, the results show that the difference in power between FIA and HK-regression increases when the difference in allele frequency between founder lines increases (Figure 1a). The power of FIA with only additive effects included in the model is higher than that of FIA with both additive and dominance effects included. These results are very similar to the ones found in Rönnegård et al. [3]. For complete dominance (Figure 1b), we can see that FIA with only additive effects performs just as well as FIA with both additive and dominance effects included. The difference in power between FIA and HK-regression is not as large as in Figure 1a, but there is still a large difference when there are equal allele frequencies within founder lines, i.e. for Case 4. For the simulations with extreme overdominance (Figure 1c), the power of FIA with only additive effects is approximately 5%, i.e. what we can expect to find by chance alone. FIA with dominance effects included performs better than FIA with only additive effects, and the differences between FIA and HK-regression are small. It should also be noted that our simulations indicate that the difference in power for HK-regression with or without dominance included is also small as long as the QTL effects are not overdominant.
Figure 1. Power analysis for a simulated QTL. The power to detect a QTL at a 5% significance level with Haley-Knott (HK) regression and FIA for the four simulated cases presented in Table 1, ranging from total fixation (Case 1) to equal allele frequencies in both lines (Case 4). Thick solid line – FIA with additive and dominance effects; thick dashed line – FIA with additive effects only; thin solid line – HK-regression with additive and dominance effects; thin dashed line – HK-regression with additive effects only. For each case, 6000 replicates were simulated and the pedigree in each replicate had four founders and 800 F_{2} individuals. In Figure 1a, an additive QTL effect (a) of 2 and a QTL dominance effect (d) of 0 were simulated together with a residual variance of 98. In Figure 1b, a = 1 and d = 1, and in Figure 1c, a = 0 and d = 2.
Table 1. Simulated levels of fixation for the four simulated scenarios ranging from a fixed QTL (Case 1) to equal frequencies in both founder lines (Case 4)
QTL genome scan for body weight in the Red Jungle Fowl × White Leghorn F_{2} cross
The chicken genome was scanned for QTL affecting body weight at 200 days of age in an F_{2} intercross between Red Jungle Fowl and White Leghorn. As previously reported [3,11], there are two QTL with large effects on chromosome 1. These two QTL also give very large score values in our study (Figure 2), and the peak values are far above the 5% genome-wide significance threshold of 101.2. The significance threshold for the same data without dominance effects included in FIA was 85.6. This increase in threshold value is expected since more parameters are included in FIA with dominance. The changes in score values in the genome scan are relatively small (Figure 2) and there is only one more peak that exceeds the significance threshold of 101.2. This QTL is located on chromosome 27 (i.e. the third chromosome from the right in Figure 2). There are also several suggestive QTL located on chromosomes 3, 4, 5, 11 and 28. The only one of these suggestive QTL that showed a substantial change in the score value after including dominance effects in FIA was the QTL on chromosome 4. In conclusion, the change in score values was small for FIA with or without dominance effects, and the significance of the QTL was mainly affected by the difference in the genome-wide significance threshold between the two models.
Figure 2. Genome scan for body weight at 200 days of age. Genome scan with score values on a log_{10} scale. The solid curve above 0 shows the score values for FIA including both additive and dominance effects. The dashed curve below 0 shows the difference in log_{10} score values relative to those obtained from FIA with only additive effects. The 5% genome-wide significance level is shown as a dashed horizontal line and the borders between chromosomes are given as vertical dashed lines. The score statistic of the FIA model is non-negative since it is defined as a quadratic form.
What do the results tell us about the importance of including dominance effects in FIA?
Our simulations show that the power of FIA including dominance effects is substantially higher for overdominant QTL. For QTL effects that are not overdominant the differences between the two versions of FIA are small. Hence, it is feasible to include dominance in FIA. We expect, however, that major genes having moderate dominance effects will be detected with the simpler additive version of FIA. These results are similar to the ones obtained by Martinez [9], who showed that the power of VC-based models does not increase substantially by including dominance effects as long as the QTL effects are not overdominant. The difference in power for HK-regression with or without dominance included in the model also seems to be small as long as the QTL effects are not overdominant. So the importance of including dominance effects in QTL analysis seems to be a general question and is related to how often we can expect major genes to be overdominant.
Although the differences between HK-regression and FIA decrease for dominant QTL effects, we still have not found a case where HK-regression outperforms FIA substantially in terms of QTL detection power. Regression methods are computationally faster than FIA, although the latter is based on the score statistic, which is easily computed. For the simulated pedigree with 800 F_{2} individuals, including dominance in FIA gives a threefold increase in computational cost (wall-clock time) for the score statistic (eq. 12).
Including dominance also requires that the dominance IBD-matrices have been computed, which may be computationally demanding unless the IBD calculations are based on the gametic IBDs (see eq. 3). The genome scan in FIA is based on a score statistic (eq. 12) and the variance components in FIA do not need to be estimated for each position, but for QTL positions we may wish to estimate the variance components of FIA. There are then two variance components for the additive effects, two for the dominance effects (see eq. 11) and one for the residual variance. Although the VC estimates are of secondary importance in FIA, estimates of the five variance components in eq. (11) are given in the Appendix for each of the four cases in Table 1, for 120 replicates of the simulated 800 F_{2} pedigree. Models with several variance components require a robust REML estimation algorithm to ensure convergence. Mishchenko et al. [12] recently developed a robust and efficient REML estimation algorithm for VC models including up to five variance components, which was not applied in our current study but is likely to become useful in the future.
We have previously shown that it is computationally feasible to include epistasis in FIA [3], but so far we have not tested FIA with epistasis on empirical data, and we may expect HK-regression to remain a useful method for detection of epistatic QTL effects (e.g. [10]) for some time. We are convinced that an important research task is to develop a computationally fast and robust version of FIA for detection of epistatic effects.
Conclusion
We have shown that FIA can be extended to include QTL dominance effects. The power of FIA is superior, or similar, to HK-regression for QTL effects with dominance. The difference in power for FIA with or without dominance is small as long as the QTL effects are not overdominant. Furthermore, we expect that FIA with only additive effects included will also be effective for finding major genes having moderate dominance effects. We therefore suggest that FIA with only additive effects should be the model to use in most situations, especially since it is computationally less intensive.
Methods
In this section we present the traditional single locus VC model that includes dominance effects of the QTL and where all base QTL allele effects are assumed to be uncorrelated [13,14]. Thereafter, we present our FIA model which was previously developed for additive QTL effects [3] and show how dominance can be included.
Traditional VC model including dominance QTL effects
The VC model including QTL effects with dominance is given by:
y = Xb + v + d + e     (1)
where y is the vector of individual phenotypes (length n), b is a vector of fixed effects and X is the corresponding design matrix, v is a vector of additive random individual QTL effects (length n) in position τ, d is a vector of random individual QTL effects for dominance (length n), and e is a vector of residual effects (length n). The variance-covariance matrix of y, assuming independent allelic effects in the base generation, is (e.g. [15]):
V = Πσ_{v}^{2} + Δσ_{d}^{2} + Iσ_{e}^{2}     (2)
where Π is the genotype IBD-matrix (size n × n) calculated in position τ, σ_{v}^{2} is the corresponding genotype QTL variance for additive effects, Δ is the dominance IBD-matrix (size n × n) calculated in position τ, σ_{d}^{2} is the QTL variance for dominance effects, I is the identity matrix of size n × n, and σ_{e}^{2} is the residual variance. An element in row i and column j of Δ can be calculated directly from the gametic IBD-matrix (e.g. [16]) as:
Δ_{ij} = g_{ij}(1, 1)g_{ij}(2, 2) + g_{ij}(1, 2)g_{ij}(2, 1)     (3)
where the values g_{ij}(k, l) are the gametic IBDs between individual i and j for the maternal/paternal alleles k and l.
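As an illustration of how one element of Δ could be obtained from gametic IBDs, the following is a minimal sketch. It assumes the commonly used form in which the dominance coefficient is the probability that both alleles of individual i are IBD with the alleles of individual j, approximated by products of gametic IBD probabilities. The function name and the 0/1 indexing of the two alleles are hypothetical choices made for this example.

```python
def dominance_ibd(g):
    """Dominance IBD coefficient for one pair of individuals (i, j).

    g is a 2 x 2 nested list where g[k][l] is the gametic IBD probability
    between allele k of individual i and allele l of individual j
    (indices 0 and 1 denote the two parental alleles; an assumption
    of this sketch).
    """
    # Both allele pairs must be IBD: either (0 with 0 and 1 with 1)
    # or (0 with 1 and 1 with 0).
    return g[0][0] * g[1][1] + g[0][1] * g[1][0]


# An individual compared with itself shares both alleles.
self_pair = [[1.0, 0.0], [0.0, 1.0]]
# Two individuals that share only one allele IBD.
one_shared = [[1.0, 0.0], [0.0, 0.0]]

print(dominance_ibd(self_pair))
print(dominance_ibd(one_shared))
```

Working from the gametic IBDs in this way avoids a separate pass over the pedigree for the dominance matrix, which is the computational point made in the discussion.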
Including dominance in the VC QTL model
Rönnegård and Carlborg [17] described the VC model in eq. 1 in terms of independent base generation effects, where:
v = Zv* and d = Wd*
Here v* is a vector of base generation allele effects and d* is a vector of dominance effects for all pairwise base allele combinations. These dominance effects are assumed to be randomly sampled from an infinite population of dominance effects with a common variance. Furthermore, the random dominance effects for homozygotes and heterozygotes are assumed to be sampled from the same distribution. The incidence matrices Z and W relate individuals to their corresponding additive and dominance effects. We thereby have a variance-covariance matrix for the random effects given by:
Moreover, with this notation we have the relationships (see [17])
Hence, for a single QTL model there is no covariance between additive and dominance effects. The estimates of σ_{v}^{2} and σ_{d}^{2} may be strongly correlated, however, since the IBD-values in Π and Δ are correlated [9].
FIA model with additive effects
FIA extends the traditional VC model to include within-line correlations of the QTL allele effects. The FIA model without dominance effects is given by [3]:
y = Xb + v + e
where the variance-covariance matrix of y is:
Here, Π_{I} is the genotypic IBD-matrix assuming independent QTL allele effects in the base generation and Π_{J} is the IBD-matrix that assumes fixation of QTL alleles within founder lines. Hence, the analysis using FIA requires an IBD estimation program that allows for different base generation structures. We used the same IBD-matrix estimation program as in [3], which is based on the deterministic algorithm published by Pong-Wong et al. [16].
FIA model with additive and dominance effects
Dominance is included in FIA by using the same linear model as in (1), but the variance-covariance matrix is not the same as in (2):
y = Xb + v + d + e
where the variance-covariance matrix of y is:
Here, Δ_{I} is the dominance IBD-matrix assuming independent QTL allele effects in the base generation and Δ_{J} is the dominance IBD-matrix that assumes fixation of QTL alleles within founder lines. The above formula for the variance-covariance matrix V was derived following the derivation of eq. (4) in Rönnegård et al. [3].
We let the variance components be independent of each other. This assumption gives the variancecovariance matrix of y as a linear function of the variance components. This is a simplification since is the same withinline correlation as and the variancecovariance matrix of y is not strictly a linear function of the variance components.
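The linear structure described in this section can be sketched as follows. The symbol names for the four QTL components are hypothetical: the text only states that there are two components for the additive effects, two for the dominance effects and one residual variance, and the exact parameterization (for example, variances and within-line covariances) is an assumption of this sketch rather than a restatement of the original formula.

```latex
% Sketch only: \theta_{v,I}, \theta_{v,J}, \theta_{d,I}, \theta_{d,J} are
% hypothetical names for the two additive and two dominance components.
V = \Pi_I\,\theta_{v,I} + \Pi_J\,\theta_{v,J}
  + \Delta_I\,\theta_{d,I} + \Delta_J\,\theta_{d,J}
  + I\,\sigma_e^2

% Because V is treated as linear in the components, each partial
% derivative is simply the corresponding matrix:
\frac{\partial V}{\partial \theta_{v,I}} = \Pi_I, \quad
\frac{\partial V}{\partial \theta_{v,J}} = \Pi_J, \quad
\frac{\partial V}{\partial \theta_{d,I}} = \Delta_I, \quad
\frac{\partial V}{\partial \theta_{d,J}} = \Delta_J, \quad
\frac{\partial V}{\partial \sigma_e^2} = I
```

Treating the components as free parameters in this way is what makes the partial derivatives constant matrices, so that the score statistic can be evaluated without iterating over variance component estimates.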
Calculating the score for the FIA model
FIA utilizes the score statistic [18-20]
where D is the gradient and F is the information matrix calculated under the null hypothesis of no QTL effects, i.e. with all QTL variance components equal to zero.
The elements of the gradient D of the loglikelihood function L are given by [21]:
where and . The partial derivatives of V are: , and . Furthermore, P is the projection matrix given by:
The elements of the information matrix F are given by [21]:
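For reference, the standard textbook REML expressions for these quantities are sketched below. It is an assumption of this sketch that the source uses exactly this form and scaling; θ_i is used here as a generic (hypothetical) name for the i-th variance component and V_i for the corresponding partial derivative of V.

```latex
% Projection matrix
P = V^{-1} - V^{-1} X \left( X^{T} V^{-1} X \right)^{-1} X^{T} V^{-1}

% Gradient of the restricted log-likelihood L, with V_i = \partial V / \partial \theta_i
D_i = \frac{\partial L}{\partial \theta_i}
    = -\tfrac{1}{2}\,\mathrm{tr}\!\left( P V_i \right)
      + \tfrac{1}{2}\, y^{T} P V_i P y

% Expected (Fisher) information
F_{ij} = \tfrac{1}{2}\,\mathrm{tr}\!\left( P V_i P V_j \right)

% Score statistic as a quadratic form in the gradient
S = D^{T} F^{-1} D

% Under the null hypothesis V = I \sigma_e^2, so that
P = \frac{1}{\sigma_e^2} \left( I - X (X^{T} X)^{-1} X^{T} \right)
```

Under the null hypothesis P does not depend on the tested position, so only the traces and quadratic forms involving the position-specific IBD-matrices have to be recomputed along the genome.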
Calculation of genome-wide significance thresholds
The significance thresholds for the genome scan were calculated by means of permutation testing (as in [3]). Residuals were calculated from a null model assuming no QTL effect. These residuals were then permuted, giving a new vector ĕ. Replicates of the phenotypic data were simulated by adding ĕ to the fitted values from the null model y = Xb + e, i.e. to X multiplied by the estimated vector of fixed effects. For each replicate, the score statistic was calculated at every tested position (5 cM apart) along the genome using eq. 12. The empirical distribution of the maximum score value from each replicate was used to obtain significance thresholds. In total, 2000 replicates were simulated.
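The permutation procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the callable max_score is a hypothetical stand-in for a function that runs the genome scan on one phenotype vector and returns the maximum score over all tested positions, and the choice of empirical quantile rule is an assumption.

```python
import math
import random


def permutation_threshold(fitted, residuals, max_score,
                          n_perm=2000, alpha=0.05, seed=1):
    """Empirical genome-wide threshold from permuted null-model residuals.

    fitted    : fitted values from the null model (one per individual)
    residuals : residuals from the null model (same length as fitted)
    max_score : hypothetical callable mapping a phenotype vector to the
                maximum score value over the genome
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        permuted = list(residuals)
        rng.shuffle(permuted)  # permute residuals among individuals
        y_rep = [f + e for f, e in zip(fitted, permuted)]
        maxima.append(max_score(y_rep))
    maxima.sort()
    # Index of the empirical (1 - alpha) quantile of the maxima.
    idx = min(len(maxima) - 1, math.ceil((1 - alpha) * len(maxima)) - 1)
    return maxima[idx]


# Toy usage: the "scan" is replaced by the built-in max, purely to show the call.
threshold = permutation_threshold([0.0, 0.0, 0.0], [1.0, 2.0, 3.0], max, n_perm=50)
print(threshold)
```

Taking the maximum over the whole genome in each replicate is what makes the resulting threshold genome-wide rather than pointwise.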
Simulation setup
In the power analyses, level of fixation within founder lines and degree of dominance were varied to evaluate the differences between FIA and HKregression. The methods were compared by their power to detect a QTL at a given position at a 5% significance level.
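Power in a simulation study of this kind is typically estimated as the proportion of replicates in which the test statistic exceeds the significance threshold. The sketch below illustrates that calculation; the function name and the inclusion of a binomial standard error are choices made for this example and are not taken from the study.

```python
import math


def empirical_power(test_statistics, threshold):
    """Return (power, standard_error) over simulated replicates.

    test_statistics : one test statistic per replicate, evaluated at the
                      simulated QTL position
    threshold       : critical value for the chosen significance level
    """
    n = len(test_statistics)
    hits = sum(1 for s in test_statistics if s > threshold)
    power = hits / n
    # Binomial standard error of the estimated proportion.
    se = math.sqrt(power * (1.0 - power) / n)
    return power, se


power, se = empirical_power([1.0, 6.0, 7.0, 2.0], 5.0)
print(power, se)
```

With several thousand replicates per case the standard error of an estimated power is below one percentage point, so differences of a few percentage points between methods are not simulation noise.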
The structure for the base generation was designed to mimic the pedigree of a Red Jungle Fowl – White Leghorn F_{2} cross [11] with one Jungle Fowl male mated to three Leghorn females, and 800 F_{2} individuals. Four different cases (Table 1) were studied by varying the fixation level within lines for a biallelic QTL. The QTL was simulated at a position having a fully informative marker so that the QTL alleles could be traced through the pedigree unambiguously.
The phenotype of an F_{2} individual i was simulated as y_{i} = A_{1i} + A_{2i} + D_{i} + e_{i}, where A_{1i} is the QTL allele effect on the paternally inherited chromosome, A_{2i} is the QTL allele effect on the maternally inherited chromosome, D_{i} is the dominance effect and e_{i} is an iid normally distributed residual effect with a variance equal to 98. A biallelic QTL was simulated where the additive effects for the two alternative alleles were 0 and a, and the dominance effect for heterozygotes was d. The values of a and d were varied from 0 to 2.
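The phenotype model above can be sketched in a few lines. This is a minimal illustration under stated assumptions: alleles are coded 0 and 1 with allele 1 carrying the additive effect a, and the function name, argument names and the optional random generator argument are hypothetical.

```python
import random


def simulate_phenotype(allele_pat, allele_mat, a, d, resid_var=98.0, rng=random):
    """Simulate y_i = A_1i + A_2i + D_i + e_i for one F2 individual.

    allele_pat, allele_mat : QTL alleles (0 or 1) inherited from sire and dam;
                             allele 0 has additive effect 0, allele 1 has effect a
    d                      : dominance effect, added for heterozygotes only
    resid_var              : variance of the normally distributed residual
    """
    additive = a * allele_pat + a * allele_mat
    dominance = d if allele_pat != allele_mat else 0.0
    residual = rng.gauss(0.0, resid_var ** 0.5)
    return additive + dominance + residual


# Example: a heterozygote under complete dominance (a = 1, d = 1).
rng = random.Random(42)
y = simulate_phenotype(1, 0, a=1.0, d=1.0, rng=rng)
print(y)
```

With a = 1 and d = 1 the heterozygote has the same expected value as the homozygote for allele 1, which corresponds to the complete dominance setting, and with a = 0 and d = 2 only heterozygotes deviate, which corresponds to the overdominance setting.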
For each of the four cases in Table 1 and for each degree of dominance, 6000 replicates were simulated.
Analysis of experimental data: Red Jungle Fowl × White Leghorn F_{2} cross
In a Red Jungle Fowl × White Leghorn F_{2} cross, we performed a full genome scan using FIA with additive and dominance effects. In this pedigree, one Red Jungle Fowl male was mated to three White Leghorn females, producing 756 F_{2} offspring with measured genotypes and phenotypes. We used an updated version of the marker map reported in [11], including 439 markers (Leif Andersson, personal communication) covering chromosomes 1 to 28. We analyzed body weight at 200 days of age. In our previous study using FIA with only additive effects we found six QTL at a 5% genome-wide significance level. These QTL were located at 102 cM on chromosome 1, 488 cM on chromosome 1, 32 cM on chromosome 5, 30 cM on chromosome 6, 21 cM on chromosome 27 and 35 cM on chromosome 28. The data are described in detail in [11].
Authors' contributions
LR performed most of the analysis and writing. FB calculated the IBD-matrices and added important ideas to the text. ÖC initiated the paper and was responsible for the development of the paper from the initial results to the final version of the manuscript. All authors have read and approved the final version of this paper.
Appendix
Variance components in FIA with dominance included (i.e. eq. 10) were estimated using the Fisher scoring algorithm given in Rönnegård and Carlborg [17].
For simulations under Case 1, the additive variance and the covariance within lines were similar, and the dominance variance was close to the dominance covariance within lines [see Additional File 1]. These results were expected since the correlation within lines is 1.0 in Case 1. Furthermore, the relative difference between the estimated variances and covariances increased when the simulated within-line correlation decreased from 1.0 in Case 1 to 0 in Case 4.
Additional file 1. Variance Component Estimation. Table including variance component estimation for the FIA model with dominance included.
The theoretical expectation of the estimated QTL variance components for fixed values of a and d depends on the level of fixation within lines (see Appendix in Rönnegård et al. [3]). For a given case in Table A1 we can see, however, that the estimated QTL variances decrease as the simulated QTL effects decrease. For a = 0 or d = 0 we do not get QTL variance estimates close to zero, which suggests that there is a bias in the estimates. This bias is likely due to the fact that the elements in the IBD matrices Π and Δ are correlated, and that it is therefore difficult to separate the additive and dominance effects in the REML estimation. In the applied Fisher scoring algorithm, each variance component was restricted to be greater than or equal to 0.1 to ensure positive variance estimates. If the algorithm had not converged within 20 iterations, the result was not analyzed and was reported as non-converged. There are five variance components in eq. (10) and a substantial number of simulations (around 15%) did not converge. The difficulties in convergence are not a major problem in FIA, however, since the genome scan is based on a score statistic that does not require VC estimation. REML estimation for models with several variance components is a general computational problem and a robust method is described in Mishchenko et al. [12].
Acknowledgements
LR and FB gratefully acknowledge FORMAS for financing this study, and ÖC acknowledges SSF for financial support.
References

1. Haley C, Elsen J, Knott S: Mapping quantitative trait loci in crosses between outbred lines using least squares. Genetics 1994, 136:1195-1207.
2. Pérez-Enciso M, Varona L: Quantitative trait loci mapping in F2 crosses between outbred lines. Genetics 2000, 155:391-405.
3. Rönnegård L, Besnier F, Carlborg O: An improved method for QTL detection and identification of within-line segregation in F2 intercross designs. Genetics 2008, 178:2315-2326.
4. Wang T, Fernando RL, Grossman M: Genetic evaluation by best linear unbiased prediction using marker and trait information in a multibreed population. Genetics 1998, 148:507-515.
5. Knott S, Elsen J, Haley C: Methods for multiple-marker mapping of quantitative trait loci in half-sib populations. Theoretical and Applied Genetics 1996, 93:71-80.
6. Pérez-Enciso M, Fernando R, Bidanel J, le Roy P: Quantitative Trait Locus Analysis in Crosses Between Outbred Lines With Dominance and Inbreeding. Genetics 2001, 159:413-422.
7. Kim JJ, Zhao H, Thomsen H, Rothschild M, Dekkers J: Combined line-cross and half-sib QTL analysis of crosses between outbred lines. Genetical Research 2005, 85:235-248.
8. Kacser H, Burns J: The control of flux. Biochemical Society Transactions 1995, 23:341-366.
9. Martinez V: Further insights of the variance component method for detecting QTL in livestock and aquacultural species: relaxing the assumption of additive effects. Genet Sel Evol 2008, 40(6):585-606.
10. Carlborg O, Jacobsson L, Åhgren P, Siegel P, Andersson L: Epistasis and the release of genetic variation during long-term selection. Nature Genetics 2006, 38:418-420.
11. Kerje S, Carlborg O, Jacobsson L, Schutz K, Hartmann C, Jensen P, Andersson L: The twofold difference in adult size between the red junglefowl and White Leghorn chickens is largely explained by a limited number of QTLs. Animal Genetics 2003, 34:264-274.
12. Mishchenko K, Rönnegård L, Holmgren S, Mishchenko V: Assessing a multiple QTL search using the variance component model. In Doctoral Thesis: Numerical Algorithms for Optimization Problems in Genetical Analysis. Edited by Mishchenko K. Mälardalen University, Sweden; 2008:119.
13. Fernando RL, Grossman M: Marker-assisted selection using best linear unbiased prediction. Genetics Selection Evolution 1989, 21:467-477.
14. Goldgar DE: Multipoint analysis of human quantitative genetic variation. American Journal of Human Genetics 1990, 47:957-967.
15. Xu S: Computation of the full likelihood function for estimating variance at a quantitative trait locus. Genetics 1996, 144:1951-1960.
16. Pong-Wong R, George A, Woolliams J, Haley C: A simple and rapid method for calculating identity-by-descent matrices using multiple markers. Genet Sel Evol 2001, 33(5):453-471.
17. Rönnegård L, Carlborg O: Separation of Base Allele and Sampling Term Effects Gives New Insights in Variance Component QTL Analysis. BMC Genetics 2007, 8:1.
18. Cox D, Hinkley C: Theoretical Statistics. Sunderland, USA: Sinauer Associates, Inc.; 1974.
19. Tang HK, Siegmund D: Mapping quantitative trait loci in oligogenic models. Biostatistics 2001, 2:147-162.
20. Putter H, Sandkuijl L, van Houwelingen J: Score test for detecting linkage to quantitative traits. Genetic Epidemiology 2002, 22:345-355.
21. Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. Sunderland, USA: Sinauer Associates, Inc.; 1998.