Statistical methods have been proposed recently to analyze longitudinal data in genetic studies. So far, little attention has been paid to examining the relationship among key factors in genetic longitudinal studies, including power, the number of families or sibships, and the number of repeated measures per individual.
We proposed a variance component model that extends classic variance component models for a single quantitative trait to mapping longitudinal traits. Our model includes covariate effects and allows genetic effects to vary over time. Using our proposed model, we examined the power, pedigree structures, and sample size through simulation experiments.
Our simulation results provide useful insights into the study design for genetic, longitudinal studies. For example, collecting a small number of large sibships is much more powerful than collecting a large number of small sibships or increasing the number of repeated measures, when the total number of measurements is comparable.
Longitudinal study designs have been routinely used to investigate the etiology and epidemiology of complex diseases, and statistical methods for analyzing longitudinal data are well established. However, there are limited applications of longitudinal data in genetic studies.
Province and Rao used path analysis to assess familial aggregation in the presence of temporal trends, although their analysis did not include genetic marker information. Longitudinal studies have also been used on a few occasions for twin and adoption studies (e.g., [3-6]). However, the main purpose of those studies was to assess the heritability of a trait rather than to map candidate loci.
Using an ad hoc approach, Levy and colleagues conducted a linkage scan of the Framingham Heart Study. They regressed the phenotype against covariates as in a standard mixed effects model, and then treated the residuals corresponding to individual measurements as a quantitative trait in standard linkage analysis software such as SOLAR. More recently, in the Genetic Analysis Workshop 13, some participants examined two-step models and some proposed joint models. The first step in a two-step model is similar to that of Levy et al.: an "ordinary" longitudinal model is fitted without consideration of genetic markers or family structures. Then, in the second step, linkage analysis is performed on one or more statistics derived from the first step. While such two-step methods are practical and simple, they are not ideal. For example, even if the covariate effects are additive to the genetic effects, potentially useful information can be lost in deriving the residuals or other summary statistics. Besides, choosing among different statistics (e.g., residuals and averages) for the second stage increases the number of tests to be performed, which raises the multiple comparison issue. Equally important, the lack of a well-defined statistical model directly associating the original phenotype with the inheritance of the markers makes it infeasible to conduct formal statistical inference. In fact, the authors in the Genetic Analysis Workshop 13 clearly pointed out that a joint approach to simultaneously estimating genetic and longitudinal model parameters is appealing, because estimates of genetic and longitudinal parameters will be mutually adjusted for one another. Thus, in this report, we consider a joint model that is related to some of the previously described models.
Our main objective is to use our model to examine the relationship among key factors in genetic longitudinal studies, including power, the number of families or sibships, and the number of repeated measures per individual.
There is a growing effort to develop mixed effects models that separate the genetic effect from environmental effects and that incorporate temporal information. However, those models do not have simple structures to accommodate genetic and temporal interactions, or to enable us to assess the longitudinal study design in linkage analysis. This raises computational concerns and may limit the analyses that can be performed, as has been pointed out previously. Hence, our idea is to use a realistic yet simple variance component model that can be used to analyze general pedigree data such as the Framingham Heart Study and that allows us to consider age-specific genetic effects and related study design issues. We choose a variance component model because this type of model is well established for linkage analysis of quantitative traits (e.g., [8,12,13]).
In this section, we report our simulation results to assess the type I error rate based on the asymptotic theory, and the power of our method in detecting linkage. We are particularly interested in the effectiveness of repeated measures in improving the power. For example, how do we determine the most cost-effective number of repeated measures? The computation was performed in the statistical software R using our own program, which is available upon request. We should note that our model and program have been used to analyze general pedigree data such as the Framingham Heart Study (to be reported in a future report), although our simulation below is focused on sibships to reduce the computational burden. Nuclear families were simulated, and fully informative markers with four equally frequent alleles were generated. All parental alleles were distinguished. For the nuclear families, phenotypes were simulated only for the siblings. In all the simulations, each sib in every nuclear family has 5 measurements taken at different times. The measurement times were simulated simply as (1, 2, 3, 4, 5). A covariate was simulated from a uniform distribution between 0 and 1. For clarity, we used f(X, t) = Xβ to generate the data, where β = (β0, β1, β2)' = (1, 1, 1)' and β0, β1 and β2 are the parameters for the intercept, the time and the simulated covariate in the mean structure. As in related studies, we did not consider dominance effects in the simulation studies and set the dominance variances at the locus of interest and over the residual genome to zero, $(\sigma_{gd}^2, \sigma_{Gd}^2) = (0, 0)$.
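The mean structure used to generate the data can be sketched as follows. This is only an illustration in Python (the original computations were done in R); the function names, the seed, and the choice to draw a fresh covariate value at each measurement time are assumptions made for this sketch.

```python
import random

BETA = (1.0, 1.0, 1.0)      # (beta0, beta1, beta2) from the simulation setup
TIMES = (1, 2, 3, 4, 5)     # measurement times shared by every sib

def mean_trait(t, x, beta=BETA):
    """Fixed-effect mean f(X, t) = beta0 + beta1 * t + beta2 * x."""
    b0, b1, b2 = beta
    return b0 + b1 * t + b2 * x

def simulate_means(rng):
    """Mean trajectory for one sib; the covariate is Uniform(0, 1).

    Assumption: one covariate draw per measurement time.
    """
    return [mean_trait(t, rng.uniform(0.0, 1.0)) for t in TIMES]

rng = random.Random(2024)   # arbitrary seed, for reproducibility only
means = simulate_means(rng)
```

The random genetic effects and the correlated measurement error would be added to these means to obtain a simulated phenotype.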
Type I error rates
To evaluate the type I error rates of the proposed tests, we considered two different null models. The first null model assumes that the genetic linkage effect due to the testing QTL and the polygenic effect are both zero, that is, the additive variances at the locus and over the residual genome are $(\sigma_{ga}^2, \sigma_{Ga}^2) = (0, 0)$. The second null model assumes that there is no genetic linkage effect due to the testing QTL but there is some polygenic effect, with $(\sigma_{ga}^2, \sigma_{Ga}^2) = (0, 1)$. We also simulated a measurement error from a normal distribution with the variance σ² equal to 7 and the autocorrelation between measurements at two time points t and u for a sib equal to exp(-0.5|t - u|). We considered in the analysis two choices of s(t): linear [s(t) = s0 + s1t] and quadratic [s(t) = s0 + s1t + s2t²]. We simulated 5,000 replications of 100 sib pairs.
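For concreteness, the within-sib error covariance implied by these settings (variance 7, autocorrelation exp(-0.5|t - u|), measurement times 1 to 5) can be built as below. This is a minimal Python sketch; the function name `exp_cov` is hypothetical.

```python
import math

def exp_cov(times, sigma2, alpha):
    """Covariance matrix with entries sigma2 * exp(-alpha * |t - u|)."""
    return [[sigma2 * math.exp(-alpha * abs(t - u)) for u in times]
            for t in times]

# Settings of the type I error simulations: sigma^2 = 7, alpha = 0.5
E = exp_cov((1, 2, 3, 4, 5), sigma2=7.0, alpha=0.5)
```

The diagonal holds the error variance, and the correlation decays geometrically with the time lag between two measurements.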
A likelihood ratio test is used to test the null hypothesis that the genetic variance due to the testing QTL equals zero (no linkage).
We use two times the natural logarithm of the likelihood ratio as the test statistic. Its asymptotic distribution appears to be a mixture of χ² distributions, but the degrees of freedom depend on s(t).
When s(t) is constant, the model is equivalent to the traditional variance-component model since we can consider only one independent parameter, i.e., either s0 or the additive variance at the locus, $\sigma_{ga}^2$. In this case, the test statistic asymptotically follows the 50:50 mixture $\frac{1}{2}\chi_0^2 + \frac{1}{2}\chi_1^2$. For linear and quadratic s(t), the test statistic also appears to follow mixtures of χ² distributions whose components depend on the form of s(t). Because we do not have theoretical proofs for these asymptotic distributions, we derived critical values empirically through simulations.
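As an illustration of the constant-s(t) case only, the sketch below computes a p-value under the usual 50:50 mixture of a point mass at zero and a χ² distribution with 1 degree of freedom, which is the standard result for the traditional variance-component linkage test. The function names are hypothetical, and the sketch does not cover the linear or quadratic s(t) cases.

```python
import math

def chi2_1_sf(x):
    """P(chi-square with 1 df > x), via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))

def mixture_p_value(lrt):
    """P-value under the 50:50 mixture of chi2_0 (point mass at 0) and chi2_1."""
    if lrt <= 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(lrt)

p = mixture_p_value(2.7055)   # about 0.05 under this mixture
```

Under this mixture the 0.05 critical value is the 0.90 quantile of χ² with 1 degree of freedom, which is why the cut-off is smaller than the familiar 3.84.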
In practice, we do not know the form of s(t). However, we can use backward selection as in regression analysis, beginning with the quadratic polynomial and testing whether the coefficients are zero. This strategy can serve as a guide in determining the final form of s(t).
Table 1 presents the empirical type I error rates based on 5,000 simulated replications under two null models. The rejection rates in the table were obtained by computing the frequencies at which the null hypotheses were rejected at the critical values from the stated asymptotic distributions. Given that we used only 100 sib pairs, the empirical type I error rates are numerically close to the nominal significance levels.
Table 1. Type I error rate comparisons based on 5,000 simulations of 100 sib pairs under two different types of null models. Null model A is simulated with no heritability due to the testing QTL and no polygenic heritability. Null model B is simulated without heritability due to the testing QTL but with polygenic heritability h² and polygenic additive variance $\sigma_{Ga}^2 = 1$. The other underlying parameters are (β0, β1, β2) = (1, 1, 1), and (σ², α) = (7, 0.5). The assumed s(t) is labeled as "l" for s0 + s1t, and "q" for s0 + s1t + s2t².
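When judging whether an empirical rejection rate is "close" to its nominal level, it helps to know the Monte Carlo error of a rate estimated from 5,000 replicates. The sketch below uses the binomial standard error; the helper names and the 1.96 multiplier are choices made for this illustration and are not part of the original analysis.

```python
import math

def mc_standard_error(rate, n_reps):
    """Binomial standard error of a rejection rate estimated from n_reps replicates."""
    return math.sqrt(rate * (1.0 - rate) / n_reps)

def within_mc_band(observed, nominal, n_reps, z=1.96):
    """True if the observed rate lies within z standard errors of the nominal level."""
    return abs(observed - nominal) <= z * mc_standard_error(nominal, n_reps)

se_005 = mc_standard_error(0.05, 5000)   # roughly 0.0031
```

At the nominal 0.05 level, for example, an empirical rate between about 0.044 and 0.056 is compatible with pure simulation noise.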
To compare the power increment from larger sibships, we considered the scenarios of collecting 200 sib pairs, 400 sib pairs and 200 nuclear families with 4 siblings each, so that we can assess the corresponding effects of the number of nuclear families, the size of the nuclear families, and the number of repeated measures on power. We simulated data from the following three forms of s(t): (a) s(t) = 1 + 0.1t; (b) s(t) = 1; (c) a logit function of t. We also generated measurement errors from a multivariate normal distribution with the variance σ² and the within-subject autocorrelation exp(-α|t - u|) between measurements at two time points t and u. To evaluate the power, we conducted a number of experiments using various genetic models, each specified by the additive variance at the locus, the polygenic additive variance, and the two error parameters: (a) $(\sigma_{ga}^2, \sigma_{Ga}^2, \sigma^2, \alpha)' = (2, 1, 7, 0.5)'$; (b) $(\sigma_{ga}^2, \sigma_{Ga}^2, \sigma^2, \alpha)' = (1, 1, 8, 0.5)'$; (c) $(\sigma_{ga}^2, \sigma_{Ga}^2, \sigma^2, \alpha)' = (0.5, 1, 8.5, 0.5)'$. Note that these four parameters determine the extent of the overall genetic heritability as well as the heritability due to a specific locus under consideration.
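To give a feel for these settings, the sketch below computes the share of the trait variance at a single time point that is attributable to the major gene and to the polygenic component, using var(y(t)) = s(t)² × (major gene variance) + (polygenic variance) + σ² from model (1) with no dominance. This pointwise share is an assumption-level illustration and is not the generalized longitudinal heritability measure used for Table 2; the function name is hypothetical.

```python
def variance_shares(s_t, var_major, var_poly, var_error):
    """Pointwise (major gene, polygenic) shares of the total trait variance."""
    major = s_t ** 2 * var_major
    total = major + var_poly + var_error
    return major / total, var_poly / total

# Genetic model (a): variances (2, 1, 7)
flat = variance_shares(1.0, 2.0, 1.0, 7.0)               # s(t) = 1
late = variance_shares(1.0 + 0.1 * 5, 2.0, 1.0, 7.0)     # s(t) = 1 + 0.1t at t = 5
```

With s(t) = 1 the three variances sum to 10 in every genetic model considered, so the major gene share is simply 0.2, 0.1, or 0.05; with s(t) = 1 + 0.1t the share grows with time.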
When presenting our power assessment, we make use of a generalized heritability measure for longitudinal trait proposed by de Andrade et al. . To incorporate the serial variance components, we express the polygenic and major gene heritabilities in our model as
Table 2 displays the polygenic and major gene heritabilities used in our simulation models when different numbers of repeated measurements are used.
Table 2. The polygenic heritability (h²) and the major gene heritability used in our simulation models
Regardless of the true form of s(t), in our estimation we assumed s(t) to be one of the following three forms: s(t) = s0, s(t)= s0 + s1t, and s(t) = s0 + s1t + s2t2 where s0 is nonnegative, and it may need to be estimated together with s1 and/or s2, depending on the choice. As stated above, one of the true s(t)'s is the logit function. This is because we want to know what happens in linkage detection when s(t) is misspecified.
To understand the gain of power as a result of more repeated measures, we examined the power using all or some of the 5 measurements for each sib. We also compared the power from our models with the power of the traditional variance component (VC) method for a single measurement. The single measure can be a measurement at a particular time point or the average of the five measurements for each sib.
Tables 3, 4, and 5 display the power in the experiments as specified above. To appreciate the incremental gain of power as the number of repeated measures increases, we compared the power estimates when we used all or some of the 5 repeated measurements. As expected, the power increases as the number of repeated measures and/or the number of families increases. However, the increment of power is not uniform, and depends on the significance level. For example, ascertaining 200 sib pairs with four repeated measures tends to yield better power than collecting 400 sib pairs with two repeated measures when there is a gene-time interaction, and vice versa when there is no gene-time interaction. The information from these tables underscores the importance of conducting the power calculation under the specific designs and significance level in order to choose the most cost-effective design.
Table 3. The power comparisons based on 500 replicates. The underlying parameters are (β0, β1, β2) = (1, 1, 1), and $(\sigma_{ga}^2, \sigma_{Ga}^2, \sigma^2, \alpha) = (2, 1, 7, 0.5)$, where $\sigma_{ga}^2$ and $\sigma_{Ga}^2$ are the additive variances at the locus and over the residual genome. The assumed s(t) is labeled as "c" for constant, "l" for s0 + s1t, and "q" for s0 + s1t + s2t².
Table 4. The power comparisons based on 500 replicates. The underlying parameters are (β0, β1, β2) = (1, 1, 1), and $(\sigma_{ga}^2, \sigma_{Ga}^2, \sigma^2, \alpha) = (1, 1, 8, 0.5)$. The assumed s(t) is labeled as "c" for constant, "l" for s0 + s1t, and "q" for s0 + s1t + s2t².
Table 5. The power based on 500 replicates of 200 4-sib families. For comparison with the other tables, we consider two repeated measures only. The underlying parameters are (β0, β1, β2) = (1, 1, 1) with various settings of $(\sigma_{ga}^2, \sigma_{Ga}^2, \sigma^2, \alpha)$. The assumed s(t) is labeled as "c" for constant, "l" for s0 + s1t, and "q" for s0 + s1t + s2t².
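The three designs compared in these tables involve the same total number of phenotype measurements, which is what makes the comparison fair. A quick check in Python (the dictionary layout and names are only for illustration):

```python
# (number of families, sibs per family, repeated measures per sib)
designs = {
    "200 sib pairs, 4 measures": (200, 2, 4),
    "400 sib pairs, 2 measures": (400, 2, 2),
    "200 4-sib families, 2 measures": (200, 4, 2),
}

def total_measurements(families, sibs, repeats):
    """Total number of phenotype measurements collected under a design."""
    return families * sibs * repeats

totals = {name: total_measurements(*d) for name, d in designs.items()}
```

Each design collects 1,600 measurements, so differences in power reflect how the measurements are allocated rather than how many are taken.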
Tables 3 and 4 reveal a serious loss of power from ignoring a gene-time interaction. For example, in Table 3 when the underlying s(t) = 1 + 0.1t, with 5 repeated measures, the power estimates obtained by ignoring the time dependence of s(t) were 0.77, 0.56, 0.26, and 0.09, respectively, at significance levels 0.05, 0.01, 0.001, and 0.0001. In contrast, the respective power estimates increased to 0.90, 0.78, 0.45, and 0.24 when we estimated s(t) from s0 + s1t. We should also note here that the fold increase is more dramatic for a more stringent significance level. On the other hand, is there a loss of power if we model s(t) when there is no time-dependent genetic effect? Or, more broadly, what happens to the power if the time-dependent effect is misspecified? Tables 3, 4, and 5 address these questions. As expected, the power is at its peak when the underlying time trend is correctly specified. However, even with a misspecified trend, the test based on our model is more powerful than the one using a single measure, regardless of whether it was from a particular age or the average of the same number of repeated measures. We should note that, from our experiment, the use of the average of repeated measures yields more power than the use of a single measure at a given time point. In other words, without any consideration for the cost and effectiveness, we gain power from repeated measures even with a simple approach.
Finally, Tables 3, 4, and 5 reveal the substantial gain in power from ascertaining large pedigrees. Table 5 displays the power of using 200 4-sib families. The power estimates using 400 sib pairs are available in Tables 3 and 4. Clearly, whenever feasible, collecting large sibships is more effective than collecting more sibships or more repeats.
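The fold increase mentioned above can be read directly off the quoted power estimates. The short Python sketch below uses only the eight numbers cited in the text; the variable names are arbitrary.

```python
levels = (0.05, 0.01, 0.001, 0.0001)
power_constant = (0.77, 0.56, 0.26, 0.09)   # time dependence of s(t) ignored
power_linear = (0.90, 0.78, 0.45, 0.24)     # s(t) estimated as s0 + s1*t

# Ratio of the two power estimates at each significance level
folds = [round(lin / con, 2) for lin, con in zip(power_linear, power_constant)]
# folds == [1.17, 1.39, 1.73, 2.67]
```

The ratio rises from about 1.2 at the 0.05 level to about 2.7 at the 0.0001 level.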
In this work, we proposed a variance component model to map candidate genes when the quantitative trait is measured repeatedly. A notable feature of our model is that it accommodates a potential gene-time interaction. In the existing literature, longitudinal information on the trait is sometimes re-processed into a single trait and then the standard variance component model is applied. Agreeing with other authors, we believe it is useful to have a unified model so that formal statistical inference can be performed. This benefit is evident from the simulation reported here.
We should note that the power is low with the sample sizes that we considered when the significance level is set at 0.0001. Since our purpose is to compare the power in various design settings, the absolute level of power is not critical; we kept the sample sizes small purely to reduce the computational time for our simulation. In practice, if 80% power is desired, for example, both the sample size and the number of simulation replications should be increased. Although longitudinal study designs are very popular in epidemiological and medical research, their use is still limited in linkage analysis. Here, we only discuss a basic model to explore the potential of using longitudinal data and to investigate cost-effective designs. Our model is related to, but has a simpler structure than, that of de Andrade et al. We focus on the time at which the data are collected, and different study subjects may have data available at different time points. We also allow a potentially general temporal trend to interact with the genetic effect. In contrast, de Andrade et al. proposed a model that assumed an individual genetic effect at every time point, which requires a uniform time schedule for all study subjects. This is a reasonable assumption for some studies including the Framingham Heart Study, but it may become restrictive for other studies.
Clearly, many important research issues warrant further investigation. For example, we need to consider gene-gene interactions, gene-environment interactions, and more general forms of gene-time interaction and fixed effects. Other classic issues including sample selection, ascertainment bias, multiple genes, and imprinting also require further investigations.
We conducted a number of simulation studies to explore the increment of power when the number of sibships is increased, when the number of repeated measures is increased, and when the size of families is increased. While we expect that these factors enhance the power, how they do so is rather intriguing. Our results can provide useful guidance for designing a genetic, longitudinal study to balance the cost, feasibility, and power. For example, collecting a small number of families with a large sibship is more effective than collecting a comparatively large number of families with a small sibship. Collecting fewer families with more repeated measures may or may not lead to more power than collecting more families with fewer repeated measures, depending on the underlying genetic models. In general, however, the relationship between the power and design is subtle, and depends on the significance level and obviously the size of genetic effects. It is wise to conduct appropriate power simulations before a genetic, longitudinal study is carried out so that the cost, the feasibility, and power can be balanced. Software can be requested from the authors for such simulations.
Although our simulations were based on nuclear families, our model can handle general pedigrees as we have used it to analyze data from the Framingham Heart Study for which the pedigree size was, on average, 5 and ranged from 2 to 29.
The model and methods
Let y denote a quantitative trait. For convenience, we first consider one pedigree. Assuming independence between pedigrees, the likelihood for multiple pedigrees is simply the product of the individual pedigree likelihoods.
Let i refer to the ith member in a pedigree and tij be the time when the quantitative trait is measured at the jth occasion, j = 1,...,Ti and i = 1,...,n. Consider the model:
$$y_i(t_{ij}) = f(X_i, t_{ij}) + s(t_{ij})\gamma_{i1} + \gamma_{i2} + e_i(t_{ij}), \qquad (1)$$
where f(Xi, tij) is a function of the fixed effect Xi and time tij, s(tij) a simple parametric function to accommodate time-varying genetic effects, γi1 the random effect for a major gene, γi2 the random effect for unspecified polygenic effects over the genome, and ei(tij) the measurement error, j = 1,...,Ti and i = 1,...,n. We assume that γi1, γi2, and ei are independent, although ei(tij), j = 1,...,Ti, has a within-subject correlation structure that needs to be specified on a case-by-case basis. It follows:
$$\mathrm{cov}(y_i(t), y_l(u)) = s(t)s(u)\,\mathrm{cov}(\gamma_{i1}, \gamma_{l1}) + \mathrm{cov}(\gamma_{i2}, \gamma_{l2}) + \delta(i = l)\,\sigma(t, u),$$
where σ(t, u) is the covariance function for e(t) and e(u) and δ(i = l) is the identity indicator. In addition, the covariances of γi1 and γi2 can be partitioned into additive and dominance variances as follows:
$$\mathrm{cov}(\gamma_{i1}, \gamma_{l1}) = \pi_{il}\sigma_{ga}^2 + k_{2,il}\sigma_{gd}^2, \qquad \mathrm{cov}(\gamma_{i2}, \gamma_{l2}) = 2\phi_{il}\sigma_{Ga}^2 + \tau_{il}\sigma_{Gd}^2,$$
where $\pi_{il} = k_{1,il}/2 + k_{2,il}$ is the proportion of alleles shared identical by descent (IBD), $k_{1,il}$ and $k_{2,il}$ are the k coefficients giving the probabilities that members i and l share 1 and 2 alleles IBD, respectively, at the locus of interest, $\phi_{il}$ and $\tau_{il}$ are respectively the expected kinship coefficient and the expected probability of sharing 2 alleles IBD over the residual components of the genome, $\sigma_{ga}^2$ and $\sigma_{gd}^2$ are respectively the additive and dominance genetic variances at the locus of interest, and $\sigma_{Ga}^2$ and $\sigma_{Gd}^2$ are respectively the total additive and dominance genetic variances over the residual components of the genome.
With s(tij) = 1, without f(Xi, tij), and without repeated measures, model (1) reduces to the standard variance component model for quantitative traits. Thus, model (1) is an extension of the standard variance component model to accommodate repeated measures with a structured gene-time interaction. The structured gene-time interaction distinguishes model (1) from the existing models. Although γi1 does not depend on time, the manifestation of genetic effects over time is captured through s(t). For simplicity, model (1) does not consider time-varying polygenic effects: there is no interaction term between γi2 and time.
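The covariance implied by model (1) for a sib pair can be sketched as below. The sketch rests on several assumptions that are labeled in the code: no dominance variance, an exponential error covariance σ² exp(-α|t - u|), twice the expected kinship coefficient equal to 1/2 for full sibs, and both coefficients equal to 1 when the two measurements belong to the same non-inbred person. The function and argument names are hypothetical.

```python
import math

def trait_cov(t, u, same_person, pi_ibd, s, var_major, var_poly,
              sigma2, alpha, two_phi=0.5):
    """cov(y_i(t), y_l(u)) under model (1), additive effects only (assumption).

    pi_ibd : proportion of alleles shared IBD at the locus (used for two sibs)
    two_phi: twice the expected kinship coefficient (1/2 for full sibs)
    """
    share_major = 1.0 if same_person else pi_ibd
    share_poly = 1.0 if same_person else two_phi
    major = s(t) * s(u) * share_major * var_major
    poly = share_poly * var_poly
    # Measurement errors are correlated only within the same person
    error = sigma2 * math.exp(-alpha * abs(t - u)) if same_person else 0.0
    return major + poly + error

def s_linear(t):
    """Gene-time interaction s(t) = 1 + 0.1t, as in simulation form (a)."""
    return 1.0 + 0.1 * t

var_t1 = trait_cov(1, 1, True, 1.0, s_linear, 2.0, 1.0, 7.0, 0.5)
cov_sibs = trait_cov(1, 2, False, 0.5, s_linear, 2.0, 1.0, 7.0, 0.5)
```

Between two sibs, only the genetic terms contribute, and the major gene term scales with both the IBD sharing and the product s(t)s(u).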
Parameter estimation and hypothesis testing
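If we arrange the phenotype in model (1) as
$$y = (y_1(t_{11}), \ldots, y_1(t_{1T_1}), \ldots, y_i(t_{i1}), \ldots, y_i(t_{iT_i}), \ldots, y_n(t_{n1}), \ldots, y_n(t_{nT_n}))', \qquad (2)$$
then its covariance matrix Σ is made up of the $T_i \times T_l$ blocks
$$\Sigma_{il} = s(t_i)s(t_l)'\,(\pi_{il}\sigma_{ga}^2 + k_{2,il}\sigma_{gd}^2) + 1_{T_i}1_{T_l}'\,(2\phi_{il}\sigma_{Ga}^2 + \tau_{il}\sigma_{Gd}^2) + \delta(i = l)E_i, \qquad i, l = 1, \ldots, n, \qquad (3)$$
where $s(t_i) = (s(t_{i1}), \ldots, s(t_{iT_i}))'$, the sharing coefficients are collected in $\Pi = (\pi_{il})_{n \times n}$ with $\pi_{il} = k_{1,il}/2 + k_{2,il}$, $K = (k_{2,il})_{n \times n}$, $\Phi = (\phi_{il})_{n \times n}$, and $\Omega = (\tau_{il})_{n \times n}$, $\sigma_{ga}^2$ and $\sigma_{gd}^2$ are the additive and dominance variances at the locus of interest, $\sigma_{Ga}^2$ and $\sigma_{Gd}^2$ are their counterparts over the residual genome, $1_{T_i}$ is a vector of $T_i$ 1's, and E is a block diagonal matrix,
$$E = \mathrm{diag}(E_1, \ldots, E_n), \qquad E_i = \left(\sigma(t_{ij}, t_{ik})\right)_{T_i \times T_i}.$$
For example, if $\sigma(t, u) = \sigma^2 e^{-\alpha|t - u|}$, we have
$$E_i = \sigma^2 \left(e^{-\alpha|t_{ij} - t_{ik}|}\right)_{T_i \times T_i}.$$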
In this work, we assume that γi1, γi2, and ei have normal distributions with mean 0. If normality is not assumed, a generalized estimating equation approach can be adopted. However, we will not explore this approach here. For clarity, we consider a specific version of model (1). Namely, let f(Xi, tij) = β0 + tijβ1 + Xi(tij)β2, where β2 is a p-vector of parameters. In addition, assume that s(t) is a first-order polynomial function, s(t) = s0 + s1t. Let
$$\beta = (\beta_0, \beta_1, \beta_2')' \qquad (4)$$
be the vector of fixed effect parameters, and let θ (5) be the vector of the covariance parameters, which collects the coefficients of s(t), the genetic variance components, and the parameters of σ(t, u). We estimate these parameters through the restricted maximum likelihood (REML) approach introduced by Patterson and Thompson, which takes into account the loss in degrees of freedom resulting from estimating fixed effects and avoids the bias in the estimation of covariance parameters.
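Note that y has a multivariate normal distribution with mean Aβ and covariance Σ, where A is the design matrix whose rows, arranged in the same order as (2), are
$$(1, \; t_{ij}, \; X_i(t_{ij})), \qquad j = 1, \ldots, T_i, \quad i = 1, \ldots, n. \qquad (6)$$
Now, let us consider M independent pedigrees. Let
$$Y = (y^{(1)\prime}, \ldots, y^{(M)\prime})', \qquad A = (A^{(1)\prime}, \ldots, A^{(M)\prime})', \qquad \Sigma = \mathrm{diag}(\Sigma^{(1)}, \ldots, \Sigma^{(M)}),$$
where y(m), A(m) and Σ(m) are of the forms (2), (6), and (3) respectively for the mth pedigree, m = 1,...,M.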
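The REML log likelihood is given, up to an additive constant, by
$$L(\beta, \theta) = -\tfrac{1}{2}\log|\Sigma| - \tfrac{1}{2}\log|A'\Sigma^{-1}A| - \tfrac{1}{2}(Y - A\beta)'\Sigma^{-1}(Y - A\beta).$$
Maximizing L(β, θ) with respect to β gives
$$\hat{\beta}(\theta) = (A'\Sigma^{-1}A)^{-1}A'\Sigma^{-1}Y.$$
Plugging $\hat{\beta}(\theta)$ into the log likelihood, we have
$$l(\theta) = -\tfrac{1}{2}\log|\Sigma| - \tfrac{1}{2}\log|A'\Sigma^{-1}A| - \tfrac{1}{2}Y'PY,$$
where $P = \Sigma^{-1}(I - A(A'\Sigma^{-1}A)^{-1}A'\Sigma^{-1})$. The REML estimator $\hat{\theta}$ for θ is obtained by maximizing the log-likelihood l(θ). Substituting $\hat{\theta}$ into $\hat{\beta}(\theta)$ gives the REML estimator for β.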
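For a fixed covariance matrix, the fixed-effect estimate is the generalized least squares solution (A'Σ⁻¹A)⁻¹A'Σ⁻¹Y. The sketch below computes it with plain Python lists so that it runs without external libraries; it is a small illustration rather than the authors' R program, and the helper names and the toy design are assumptions. A numerical library would normally be used for pedigrees of realistic size.

```python
def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def gls(A, Sigma, y):
    """beta_hat = (A' Sigma^{-1} A)^{-1} A' Sigma^{-1} y."""
    At = [list(r) for r in zip(*A)]
    sinv_a = solve(Sigma, A)                    # Sigma^{-1} A
    sinv_y = solve(Sigma, [[v] for v in y])     # Sigma^{-1} y
    lhs = matmul(At, sinv_a)                    # A' Sigma^{-1} A
    rhs = matmul(At, sinv_y)                    # A' Sigma^{-1} y
    return [row[0] for row in solve(lhs, rhs)]

# Toy check: noise-free data y = 1 + 2t at t = 1, 2, 3 (illustrative values)
A = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Sigma = [[2.0, 1.0, 0.5], [1.0, 2.0, 1.0], [0.5, 1.0, 2.0]]
beta_hat = gls(A, Sigma, [3.0, 5.0, 7.0])
```

Because the toy data lie exactly on the line 1 + 2t, the estimate recovers (1, 2) for any positive definite Σ.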
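Write $\Sigma_i = \partial\Sigma/\partial\theta_i$ and $\Sigma_{ij} = \partial^2\Sigma/\partial\theta_i\partial\theta_j$. Based on the theory of matrix derivatives, we have
$$\frac{\partial}{\partial\theta_i}\left(\log|\Sigma| + \log|A'\Sigma^{-1}A|\right) = \mathrm{tr}(P\Sigma_i), \qquad \frac{\partial P}{\partial\theta_i} = -P\Sigma_i P.$$
Therefore, the first-order partial derivative of the log likelihood l(θ) with respect to θ gives
$$\frac{\partial l}{\partial\theta_i} = -\tfrac{1}{2}\mathrm{tr}(P\Sigma_i) + \tfrac{1}{2}Y'P\Sigma_i PY,$$
and the second-order partial derivative of the log likelihood l(θ) with respect to θ gives
$$\frac{\partial^2 l}{\partial\theta_i\partial\theta_j} = -\tfrac{1}{2}\mathrm{tr}(P\Sigma_{ij}) + \tfrac{1}{2}\mathrm{tr}(P\Sigma_i P\Sigma_j) + \tfrac{1}{2}Y'P\Sigma_{ij}PY - Y'P\Sigma_i P\Sigma_j PY.$$
Denote the matrix of the negative second partial derivatives of l(θ) as
$$J(\theta) = -\left(\frac{\partial^2 l}{\partial\theta_i\partial\theta_j}\right).$$
A Newton-Raphson algorithm yields
$$\theta^{(k+1)} = \theta^{(k)} + J(\theta^{(k)})^{-1}\left.\frac{\partial l}{\partial\theta}\right|_{\theta = \theta^{(k)}}.$$
Iterate until changes in successive estimates of all parameters are sufficiently small. Let $\hat{\theta}$ be the converged estimate of θ.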
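The iteration can be illustrated on a deliberately simple one-parameter problem: estimating the variance v of mean-zero normal data, for which the log likelihood is -(n/2) log v - S/(2v) with S the sum of squares. The toy data, starting value, and function names below are assumptions made for this sketch; the actual REML problem has several parameters and uses the matrix derivatives of l(θ).

```python
def newton_raphson(score, neg_hessian, theta0, tol=1e-10, max_iter=100):
    """Iterate theta <- theta + score(theta) / neg_hessian(theta) until the step is tiny."""
    theta = theta0
    for _ in range(max_iter):
        step = score(theta) / neg_hessian(theta)
        theta += step
        if abs(step) < tol:
            break
    return theta

data = [1.2, -0.7, 0.3, 2.1, -1.5, 0.4]   # toy observations (illustrative)
n = len(data)
ss = sum(v * v for v in data)

def score(v):
    """First derivative of the log likelihood with respect to v."""
    return -n / (2.0 * v) + ss / (2.0 * v * v)

def neg_hessian(v):
    """Negative second derivative of the log likelihood with respect to v."""
    return -n / (2.0 * v * v) + ss / v ** 3

v_hat = newton_raphson(score, neg_hessian, theta0=0.5 * ss / n)
```

In this toy case the iteration converges to the closed-form maximizer S/n, which provides a check on the update rule and the stopping criterion.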
If (β*, θ*) is the vector of true parameter values, then by classical statistical theory $(\hat{\beta} - \beta^*, \hat{\theta} - \theta^*)$ asymptotically follows a multivariate normal distribution with mean 0, and the asymptotic covariance matrix can be estimated by $I^{-1}(\hat{\beta}, \hat{\theta})$, where I(β, θ) is the information matrix.
Linkage is tested by a likelihood ratio test that compares the likelihood under the alternative hypothesis, in which the genetic variance component due to the testing QTL is estimated, with that under the null hypothesis that the genetic variance due to the testing QTL equals zero (no linkage). Twice the natural logarithm of the likelihood ratio of these two models may have a complex asymptotic distribution, a mixture of χ² distributions, whose exact form depends on how s(t) is defined.
HZ contributed to the conception and design of the study, analysis and interpretation of data, and XZ contributed to the design of the study, wrote the programs, and performed the simulation analysis. Both authors have been involved in writing the manuscript and approved this final version.
This research is supported in part by grants DA017713 and DA016750 from the National Institute on Drug Abuse.
Statistics in Medicine 1988, 7:185-198.
Levy D, DeStefano AL, Larson MG, O'Donnell CJ, Lifton RP, Gavras H, Cupples LA, Myers RH: Evidence for a gene influencing blood pressure on chromosome 17. Genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the Framingham heart study.
Genetic Epidemiology 2003, (Suppl 25):18-28.
Genetics Selection Evolution 2003, 35:185-198.
American Journal of Human Genetics 1994, 54:535-543.
Biometrika 1971, 58:545-554.
Journal of the American Statistical Association 1987, 82:605-610.