Abstract
Background
The summary measure approach (SMA) is sometimes the only applicable tool for the analysis of repeated measurements in medical research, especially when the number of measurements is relatively large. This study aimed to describe techniques based on summary measures for the analysis of linear trend repeated measures data and then to compare performances of SMA, linear mixed model (LMM), and unstructured multivariate approach (UMA).
Methods
Practical guidelines based on the least squares regression slope and mean of response over time for each subject were provided to test time, group, and interaction effects. Through Monte Carlo simulation studies, the efficacy of SMA vs. LMM and traditional UMA, under different types of covariance structures, was illustrated. All the methods were also employed to analyze two real data examples.
Results
Based on the simulation and example results, it was found that the SMA completely dominated the traditional UMA and performed convincingly close to the best-fitting LMM in testing all the effects. However, the LMM was often not robust and led to nonsensical results when the covariance structure for errors was misspecified. The results support discarding the UMA, which often yielded extremely conservative inferences for such data.
Conclusions
It was shown that the summary measure approach is simple, safe and powerful, and that its loss of efficiency compared to the best-fitting LMM was generally negligible. The SMA is recommended as the first choice to reliably analyze linear trend data with a moderate to large number of measurements and/or small to moderate sample sizes.
Background
In many fields of science, repeated measurements of a response variable are taken on each subject over time to assess the changes in response. The cumbersome aspect of analyzing such data is that the measurements within a subject are correlated over time. There are two major strategies for overcoming or accounting for this correlation.
First, one can reduce the vector of responses of each subject to a single value by a descriptive statistic and apply standard univariate approaches to test the effects related to the corresponding summary measure. The use of the summary measure approach (SMA) was first suggested by Wishart [1]. Several strategies based on the least squares regression slope and mean of response over time were recommended to evaluate the differences between the groups [2-6]. Moreover, the utility of Kendall's τ_{b} as a summary measure of within-subjects trend in psychiatric longitudinal studies, where the key assumptions of parametric methods do not hold, was investigated [7,8].
Second, one can use methods which take the covariances between the measurements into account. Two common and traditional approaches for normally distributed responses are repeated measures ANOVA and MANOVA. In order to avoid inflating type I error rate, the denominator degrees of freedom of the F statistics in the repeated measures ANOVA approach should be adjusted under departures from a restrictive assumption on covariance structures, namely sphericity. But there is no obvious advantage in using the adjusted F tests against the multivariate tests, and generally the adjustments should be avoided [9,10]. In contrast, the repeated measures MANOVA approach makes no assumption regarding covariance structure and hence, it is sometimes known as unstructured multivariate approach (UMA). The only key advantage of the repeated measures ANOVA approach over the UMA is that it can still be implemented in the case where the number of measurements is greater than the sample size.
The linear mixed model (LMM) is more advanced and flexible since it can handle subjects with incomplete measurements and measurements that are unequally spaced in time. But the performance of the LMM in testing the effects is highly dependent on the choice of an appropriate covariance structure for errors [11,12]. On the other hand, the choice of a parsimonious covariance structure in a small sample design can lead to more efficient inferences concerning the fixed-effects parameters. This dependence makes the LMM inconvenient and unreliable, especially for those who are not familiar with the fundamental principles of mixed models.
Although SMA is a simple, robust and sometimes only applicable tool for the analysis of repeated measures studies, there exists no obvious performance comparison on using the SMA vs. other competitors. Moreover, the application of the SMA has been mostly based on using one summary statistic to assess only the total group difference.
The present study considers repeated measures data in which the pattern of the response profile can be described by a linear trend and the responses are measured on a continuous scale. The main objectives of this study are:
a) To describe techniques to test time (within-subjects), group (between-subjects) and group × time interaction effects on the basis of two common summary measures, i.e. the least squares regression slope and the mean of response over time.
b) To compare the performance of the SMA, LMM and UMA in the analysis of simulated data from an LMM framework under different types of covariance structures. The approach is also illustrated and compared with the competitors using two real data sets.
In our simulations, there is a focus on situations where the LMM may provide extremely unsatisfactory performance, such as misspecification of the covariance structure for errors, small and moderate sample sizes, and a relatively large number of measurements.
Methods
Unstructured multivariate approach (UMA)
The UMA handles the measurements within a subject as a vector of multivariate responses and treats time points as levels of a qualitative factor with no order. This approach is restricted to equally spaced time points and balanced data with complete measurements, and it also assumes the homogeneity of covariance matrices in all the k groups.
Let Y_{ih} = (Y_{ih1},...,Y_{ihm})' denote the vector of m responses from the ith subject in group h for i = 1,...,n_{h}, h = 1,...,k. It is assumed that the response vectors, Y_{ih}, are independent and have a multivariate normal distribution with mean μ_{h} and common covariance matrix Σ. The total mean vector is denoted by μ. If there is no additional covariate, one can use a profile model as
Y_{ih} = μ_{h} + ε_{ih}, (1)
where ε_{ih} is the vector of errors for the ith subject in group h.
The primary hypothesis of interest in a profile analysis is the parallelism of the k groups' profiles, i.e. no group × time interaction effect. The hypothesis can be constructed as H_{0}: Cμ_{1} = ... = Cμ_{k} for an appropriate transformation matrix C with rank m - 1. If the test of interaction is not significant, the tests of the main effects are not confounded. In order to compute any MANOVA-type test statistic such as Wilks' lambda (Λ), the condition N - k > m - 1 is necessary, where N is the total number of subjects. Otherwise, the estimated covariance matrix of the transformed responses would not be nonsingular and positive definite. To test the time effect, one can investigate the equality of the m elements of the total mean vector (μ) using the one-sample Hotelling's T^{2} test on the m - 1 differences between adjacent measurements from each subject. Here, the same strategy as the SMA is utilized to test the group effect, as it is often more efficient than MANOVA-type tests for comparing the groups' mean vectors.
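As a small worked illustration of the transformation matrix (a sketch for m = 4 only; the element notation μ_{hj} for the jth component of the mean vector of group h is introduced here for illustration), one possible choice of C takes differences between adjacent time points:

```latex
C =
\begin{pmatrix}
-1 & 1 & 0 & 0 \\
0 & -1 & 1 & 0 \\
0 & 0 & -1 & 1
\end{pmatrix},
\qquad
C\mu_h =
\begin{pmatrix}
\mu_{h2} - \mu_{h1} \\
\mu_{h3} - \mu_{h2} \\
\mu_{h4} - \mu_{h3}
\end{pmatrix}.
```

This matrix has rank 3 = m - 1, and equality of Cμ_{h} across groups states that the increments between adjacent time points are the same in every group, which is the parallelism hypothesis. Other full-rank contrast matrices would lead to the same test.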
Linear mixed model (LMM)
Let Y_{i} denote the m_{i} × 1 vector of responses from the ith subject for i = 1,...,N, where N is the total number of subjects. In contrast to the UMA, the subjects may have different measuring time points and be unbalanced in terms of the number of measurements. The general form of the LMM is
Y_{i} = X_{i}β + Z_{i}b_{i} + ε_{i}, (2)
where X_{i} is an m_{i} × p fixed-effects design matrix for the ith subject, β is a p × 1 vector of fixed-effects parameters for the population, b_{i} is a q × 1 vector of random effects for the ith subject, Z_{i} is an m_{i} × q random-effects design matrix for the ith subject with q ≤ p, and ε_{i} is an m_{i} × 1 vector of within-subject errors. The random-effects vectors, b_{i}, are assumed to be independent and to have a multivariate normal distribution with mean zero and covariance matrix G_{i}, and the error vectors, ε_{i}, are assumed to be independent and to have a multivariate normal distribution with mean zero and covariance matrix R_{i}. In addition, it is assumed that b_{i} and ε_{i} are independent of one another. The LMM defines the covariances of the measurements within a subject by the covariances of the random effects (G_{i}) and the covariances of the errors (R_{i}). We used the estimators based on the restricted maximum likelihood (REML) method to construct the F statistics of the hypotheses since, in general, REML yields less biased estimates of the variance components than the maximum likelihood (ML) approach and avoids inflating type I error rates [12,13].
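The way in which the two covariance matrices combine can be written explicitly. Assuming the usual form of the LMM with fixed-effects design matrix X_{i} and random-effects design matrix Z_{i}, and using the stated independence of b_{i} and ε_{i}, the marginal distribution of the responses is:

```latex
\operatorname{Var}(Y_i) = Z_i G_i Z_i^{\top} + R_i ,
\qquad
Y_i \sim N\!\left(X_i \beta,\; Z_i G_i Z_i^{\top} + R_i\right).
```

The choice of a working structure for R_{i} therefore changes the implied covariance of the repeated measurements, which is why misspecification can affect the tests of the fixed effects.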
The summary measure approach (SMA)
In this section, we describe how to apply the least squares regression slope and mean of response over time for each subject to test the effects of time, group and group × time interaction in repeated measures studies.
The slope of the least squares regression line was applied to summarize the relationship between response and time for each subject (the within-subjects effect). If the pattern of individual profiles is linear or at least monotonic, the slopes can appropriately summarize the rate of change of response over time in the subjects. For repeated measures designs, the primary hypothesis is whether the pattern of change over time is the same across the k groups, i.e. no group × time interaction effect. Under the assumption of no interaction effect, the slopes in the k groups should not be significantly different. For this purpose, once the slopes are obtained for each subject, the ordinary k-sample tests such as the one-way ANOVA F or Kruskal-Wallis test (for k > 2) and Student's t or Wilcoxon-Mann-Whitney test (for k = 2) can be employed to assess the equality of the slopes in the groups. If the test of interaction is not significant, one would be interested in assessing the main effects.
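The interaction test described above can be sketched in a few lines. The following is a minimal illustration for k = 2, assuming complete data at common time points; the function names and the toy data are hypothetical and are not taken from the study. It computes one least squares slope per subject and then Student's pooled two-sample t statistic on the slopes.

```python
from math import sqrt
from statistics import mean, variance


def ls_slope(times, values):
    """Least squares slope of one subject's responses regressed on time."""
    t_bar, y_bar = mean(times), mean(values)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx


def pooled_t(sample_a, sample_b):
    """Student's two-sample t statistic (equal variances) and its df."""
    na, nb = len(sample_a), len(sample_b)
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(sample_a) - mean(sample_b)) / se, na + nb - 2


# Hypothetical toy data: one list of repeated measurements per subject.
times = [2, 4, 6, 8, 10]
group0 = [[2.1, 2.9, 4.2, 5.0, 6.1],
          [1.8, 3.1, 3.9, 5.2, 5.8],
          [2.3, 3.0, 4.1, 4.8, 6.0]]
group1 = [[2.0, 4.1, 6.2, 7.9, 10.1],
          [2.2, 3.8, 5.9, 8.2, 9.8],
          [1.9, 4.0, 6.1, 8.0, 10.2]]

slopes0 = [ls_slope(times, y) for y in group0]
slopes1 = [ls_slope(times, y) for y in group1]
t_stat, df = pooled_t(slopes1, slopes0)  # tests equality of mean slopes
print(slopes0, slopes1, t_stat, df)
```

The p-value would then be obtained from Student's t distribution with the returned degrees of freedom, using any statistical library or table.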
The hypothesis of no time (within-subjects) effect states that all the m elements of the total mean vector (μ) are identical. Under this assumption, the overall mean of the slopes in the population must be zero. To test this hypothesis, the one-sample t test can be applied to the sample slopes to assess the departure of the mean slope from zero.
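A minimal sketch of this test follows; the slope values are hypothetical and the function name is introduced only for illustration. It returns the one-sample t statistic for the null hypothesis that the mean slope is zero, together with its degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """One-sample t statistic for H0: population mean equals mu0, and its df."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n)), n - 1


# Hypothetical per-subject least squares slopes, pooled over the groups.
slopes = [0.51, 0.47, 0.55, 0.49, 0.53, 0.45]
t_stat, df = one_sample_t(slopes)
print(t_stat, df)
```

A large absolute value of the statistic relative to Student's t distribution with n - 1 degrees of freedom indicates a time effect.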
For testing the group (between-subjects) effect, the mean of measurements over time for each subject is used as a summary measure. By analogy with the interaction effect case, the ordinary k-sample tests are applied, but this time to assess the equality of the individual means in the groups.
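For k > 2 groups, the same idea can be sketched with a one-way ANOVA F statistic computed on the subject means. The data and names below are hypothetical, and the sketch assumes complete data so that the arithmetic mean is a sensible summary of each subject's responses.

```python
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA F statistic with numerator and denominator df."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df1, df2 = k - 1, n_total - k
    return (ss_between / df1) / (ss_within / df2), df1, df2


# Hypothetical responses: one list of repeated measurements per subject.
groups_raw = {
    "A": [[2.0, 2.5, 3.1], [1.8, 2.4, 2.9], [2.2, 2.6, 3.3]],
    "B": [[2.9, 3.6, 4.1], [3.1, 3.4, 4.4], [2.7, 3.5, 4.0]],
    "C": [[2.1, 2.4, 3.0], [2.0, 2.7, 3.2], [1.9, 2.3, 3.1]],
}
# Summary measure: the mean response over time for each subject.
subject_means = [[mean(y) for y in subjects] for subjects in groups_raw.values()]
f_stat, df1, df2 = one_way_f(subject_means)
print(f_stat, df1, df2)
```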
A permutation procedure can also be employed to assess the interaction and group effects where the underlying assumptions of the standard tests do not hold or cannot be reasonably checked due to small sample sizes in the groups.
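One possible form of such a permutation procedure for two groups is sketched below. It is a Monte Carlo approximation under assumed settings (the number of permutations, the seed and the use of the absolute difference in means as the test statistic are illustrative choices, not those of the study). The summary measures, slopes or subject means, are repeatedly reassigned to the groups at random.

```python
import random
from statistics import mean


def permutation_pvalue(sample_a, sample_b, n_perm=5000, seed=1):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    na = len(sample_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of subjects to groups
        diff = abs(mean(pooled[:na]) - mean(pooled[na:]))
        if diff >= observed - 1e-12:  # tolerance for floating point ties
            hits += 1
    # Counting the observed arrangement avoids a p-value of exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-subject slopes in two groups.
slopes0 = [0.50, 0.46, 0.52, 0.49, 0.47]
slopes1 = [0.98, 1.03, 1.00, 0.95, 1.02]
print(permutation_pvalue(slopes0, slopes1))
```

With very small groups, all possible reassignments could be enumerated instead of sampled.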
Simulation study
For the purpose of data simulation, a simple linear trend mixed model with a random coefficient only for the intercept and a two-category grouping variable was considered. The model can be expressed as
Y_{ij} = β_{0} + β_{1}X_{i} + β_{2}t_{ij} + β_{3}X_{i}t_{ij} + b_{0i} + ε_{ij}, (3)
where Y_{ij} is the jth measurement from the ith subject at time t_{ij}, X_{i} is a grouping variable with the values 0 and 1, b_{0i} is the random intercept and ε_{ij} is the error term, for i = 1,...,N and j = 1,...,m_{i}.
Linear trend mixed model data was generated based on the model (3) with the same measuring time points t_{ij }= t_{j }= 2j for all the subjects, m_{i }= m = 5, 10, and 20 measurements and β_{0 }= 2, in which the random effects, b_{0i}, were assumed to be independently normally distributed with mean zero and standard deviation 0.25.
Since tests of the within-subjects effects depend strongly on the number of measurements, the values of β_{2} and β_{3} were adjusted with respect to the m values. Different combinations of β_{1}, β_{2} and β_{3} were constructed to compute the empirical type I error rates and powers for testing the three effects.
We considered the following three covariance structures for errors to generate artificial data and fit the LMMs:
• Simple or independent (IND): R_{i }= σ^{2}I, where I is an m × m identity matrix.
• First-order autoregressive (AR1) with ρ = 0.7: R_{i} = σ^{2}H, where H = [h_{jj'}] is an m × m matrix with h_{jj'} = ρ^{|j - j'|} for all j and j'.
• Unstructured (UNS): R_{i }= [r_{jj'}] is an m × m covariance matrix with arbitrary structure.
For simplicity, we defined the true structures as those which were used to generate data and the working structures as those which were used to fit the model. In all the cases, it was assumed that the errors were normally distributed with zero mean and in the cases of IND and AR1, the error variances were fixed over time and equal to σ^{2 }= 0.5.
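The data-generating step can be sketched as follows. This is an illustration rather than the authors' R code: it assumes that the linear trend model has the form Y_{ij} = β_{0} + β_{1}X_{i} + β_{2}t_{ij} + β_{3}X_{i}t_{ij} + b_{0i} + ε_{ij}, uses the stated values β_{0} = 2, standard deviation 0.25 for b_{0i}, σ^{2} = 0.5 and ρ = 0.7, and takes hypothetical values for the remaining coefficients. AR1 errors are produced by a stationary recursion; setting rho = 0 gives the IND structure.

```python
import random
from math import sqrt


def simulate_subject(group, m, beta, rng, sd_b0=0.25, sigma2=0.5, rho=0.7):
    """Simulate one subject's m responses at times t_j = 2j with AR(1) errors."""
    b0, b1, b2, b3 = beta
    intercept_dev = rng.gauss(0.0, sd_b0)  # random intercept b_0i
    sigma = sqrt(sigma2)
    err = rng.gauss(0.0, sigma)  # stationary start: Var(e_1) = sigma^2
    responses = []
    for j in range(1, m + 1):
        if j > 1:
            # Keeps Var(e_j) = sigma^2 and Corr(e_j, e_j') = rho^|j - j'|.
            err = rho * err + sqrt(1.0 - rho ** 2) * rng.gauss(0.0, sigma)
        t = 2 * j
        responses.append(b0 + b1 * group + b2 * t + b3 * group * t
                         + intercept_dev + err)
    return responses


def simulate_dataset(n_per_group, m, beta, seed=0, **kwargs):
    """Return (group, responses) pairs for two groups of equal size."""
    rng = random.Random(seed)
    return [(g, simulate_subject(g, m, beta, rng, **kwargs))
            for g in (0, 1) for _ in range(n_per_group)]


# beta = (beta_0, beta_1, beta_2, beta_3); only beta_0 = 2 is from the text.
data = simulate_dataset(n_per_group=10, m=5, beta=(2.0, 0.0, 0.1, 0.0))
print(data[0])
```

An unstructured covariance matrix would instead require drawing the error vector from a multivariate normal distribution with a chosen m × m matrix, for example via its Cholesky factor.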
One thousand data sets were generated for n_{1} = n_{2} = n = 5, 10, 30 and 50 subjects per group under various choices of the above circumstances.
We used the free statistical software environment R to generate the artificial data sets and to fit all of the approaches presented in the Methods section.
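How an empirical type I error rate or power is obtained from such replicates can be sketched for the SMA test of interaction. The settings below are assumptions made for illustration: two groups of n = 10 subjects, m = 5, independent errors, hypothetical values of β_{2} and β_{3}, and a hard-coded critical value 2.101 (the two-sided 5% point of Student's t with 18 degrees of freedom, valid only for n = 10 per group). The group main effect is omitted because it does not affect the slopes.

```python
import random
from math import sqrt
from statistics import mean, variance


def ls_slope(times, values):
    """Least squares slope of one subject's responses on time."""
    t_bar, y_bar = mean(times), mean(values)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    return sxy / sum((t - t_bar) ** 2 for t in times)


def simulate_slopes(n, times, group, beta2, beta3, rng):
    """Slopes for n subjects of one group; IND errors with variance 0.5."""
    slopes = []
    for _ in range(n):
        b0i = rng.gauss(0.0, 0.25)  # random intercept
        y = [2.0 + b0i + (beta2 + beta3 * group) * t
             + rng.gauss(0.0, sqrt(0.5)) for t in times]
        slopes.append(ls_slope(times, y))
    return slopes


def rejection_rate(beta3, n=10, m=5, beta2=0.1, n_rep=1000,
                   t_crit=2.101, seed=2024):
    """Share of replicates in which the pooled t test on slopes rejects H0."""
    rng = random.Random(seed)
    times = [2 * j for j in range(1, m + 1)]
    rejections = 0
    for _ in range(n_rep):
        s0 = simulate_slopes(n, times, 0, beta2, beta3, rng)
        s1 = simulate_slopes(n, times, 1, beta2, beta3, rng)
        pooled_var = (variance(s0) + variance(s1)) / 2  # equal group sizes
        t_stat = (mean(s1) - mean(s0)) / sqrt(pooled_var * 2 / n)
        rejections += abs(t_stat) > t_crit
    return rejections / n_rep


type_one_error = rejection_rate(beta3=0.0)  # H0 of no interaction is true
power = rejection_rate(beta3=0.2)           # H0 is false
print(type_one_error, power)
```

The first quantity estimates the type I error rate and should lie near the nominal 5% level; the second estimates the power for the chosen alternative.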
Results
Simulation results
Within-subjects (time) and within-by-between-subjects (interaction) effects
Tables 1 and 2 display the empirical type I error rates and powers of the tests of time and interaction effects for various covariance structures, respectively. The first rows in each part, where β_{1} = β_{2} = 0 (β_{3} = 0), display the empirical type I error rates, and the rows corresponding to β_{1} > 0 and β_{2} > 0 (β_{3} > 0) show the empirical powers in testing the time (interaction) effect. Because of the similarities between the results of testing time and interaction effects, we combined the results in this section, and the following description applies to both effects.
Table 1. Type I error rates and powers for testing within-subjects (time) effect where rows with β_{1} = 0 and β_{2} = 0 give the type I error rates, and the other rows are powers
Table 2. Type I error rates and powers for testing group × time interaction effect where rows with β_{3 }= 0 give the type I error rates, and the other rows are powers
First, the three approaches are compared under the IND and AR1 as true structures. As illustrated in both Tables 1 and 2, the empirical type I error rates of the SMA and UMA were always close to (and often smaller than) the nominal significance level (5%). However, the LMM in testing both effects displayed notably larger values for the IND working structure under the AR1 as true structure and, more generally, for the UNS working structure under the two true structures. Unfortunately, the inflation of type I error rates for the IND working structure under the AR1 true structure persisted as n increased. Because the type I error rates were not preserved in these cases, the notably greater empirical powers of the LMM compared with the other approaches are misleading. In summary, the empirical powers of the SMA were notably greater than those of the UMA and were often close to the corresponding values of the best-fitting LMM. It is also worth mentioning that while the powers of the SMA and LMM tend to 1 for some larger values of n, the values of the UMA depart evidently from them, such as for n = 50 with m = 5, 10, 20 and, to some extent, n = 30 with m = 10, 20.
Next, we consider the simulation results for the UNS as true covariance structure in testing time and interaction effects. The LMMs with the IND and AR1 working structures preserved the type I error rates again. Interestingly, like the IND and AR1 true structures, the empirical type I error rates of the LMM for the UNS working structure were not preserved for smaller and larger values of n and m, respectively. It is worthwhile to note that the empirical type I error rates of the LMM were relatively comparable to the corresponding values of the other approaches only for larger values of n accompanied by smaller values of m; n = 30 and 50 with m = 5, and somewhat, n = 50 with m = 10 in both Tables. However, the empirical powers of the SMA and all the LMMs were similar under such circumstances. Only in the case of m = 5 with n = 30, 50 under the UNS true structure, the UMA was comparable with the SMA in testing both effects.
Between-subjects (group) effect
The empirical type I error rates and powers of the test of group effect are displayed in Table 3 in which the empirical type I error rates are the values of the first rows where β_{1 }= β_{2 }= 0 and the other rows, where β_{1 }> 0 and β_{2 }> 0, display the empirical powers. Again, it should be noted that both the SMA and UMA use the same strategy to test group effect. Hence, Table 3 only reported the results for the SMA and LMM.
Table 3. Type I error rates and powers for testing between-subjects (group) effect where rows with β_{1} = 0 and β_{2} = 0 give the type I error rates, and the other rows are powers
First, the simulation results for testing the group effect are considered under the IND and AR1 as true covariance structures. Except for the UNS working structure, both the SMA (UMA) and the LMM often obtained the same empirical type I error rates, close to the nominal significance level. Contrary to what we obtained for the two other effects, the LMM preserved the type I error rates for the IND working structure under the AR1 as true structure. The LMM with the UNS working structure tended to have obviously larger empirical type I error rates than the SMA (UMA). The empirical powers of the SMA and LMM were virtually identical under both the IND and AR1 true structures when the type I error rates were preserved by the LMM.
Now, we consider the results for the UNS as true covariance structure. The LMM with the UNS working structure yielded the preserved type I error rates only for larger values of n accompanied by smaller values of m such as n = 30, 50 with m = 5, 10. However, the LMM preserved type I error rates for the IND and AR1 working structures. In these comparable circumstances, the differences in the powers between the SMA and all the LMMs were negligible.
Illustrative examples
Example 1: Pituitary-pteryomaxillary distance data
The first example is a small data set on a facial distance previously published by Potthoff and Roy [14] conducted at the University of North Carolina Dental School. The distance (mm) from the centre of the pituitary gland to the pteryomaxillary fissure was measured at age 8, 10, 12, and 14 in two groups of children (11 girls and 16 boys). The data set has also been analyzed by several analytic methods [12,15].
Figure 1 displays the mean profiles in boys and girls and indicates a departure from the parallelism hypothesis. In general, boys tend to have larger pituitary-pteryomaxillary distances and a faster growth rate than girls. In addition, the distances increase over age points in both groups of children.
Figure 1. Mean profiles of pituitary-pteryomaxillary distances.
Given model (3), three models were fitted with IND, AR1, and UNS covariance structures. Random intercept and slope models with the three covariance structures were also employed. Table 4 reports the results for the six LMMs, UMA and SMA, as well as Akaike's information criterion (AIC) and Bayesian information criterion (BIC) as two model selection indices for the LMMs. The model with the smallest criterion provides the best fit to data. Based on both foregoing criteria, model 1 with IND covariance structure was preferred. It is worth mentioning that a random intercept model with IND covariance structure for errors yields a compound symmetry covariance structure between the responses.
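The compound symmetry remark can be verified directly. Writing σ_{b}^{2} for the variance of the random intercept (a symbol introduced here for illustration) and σ^{2} for the error variance under the IND structure, the covariance of two responses from the same subject is:

```latex
\operatorname{Cov}(Y_{ij}, Y_{ij'}) =
\begin{cases}
\sigma_b^2 + \sigma^2, & j = j', \\
\sigma_b^2, & j \neq j',
\end{cases}
\qquad
\operatorname{Corr}(Y_{ij}, Y_{ij'}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}
\quad (j \neq j').
```

All variances are equal and all pairs of time points share one common correlation, which is the definition of compound symmetry.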
Table 4. Pituitary-pteryomaxillary distances data: summary of test results
All the LMMs and the SMA showed a significant interaction effect at the 5% significance level. These results indicated that the growth pattern in boys was faster than that in girls. Although one could not reject the hypothesis of no interaction effect by the UMA, there was some evidence that the profiles in Figure 1 were not parallel.
All the approaches yielded significant results for the two main effects on the facial growth measurements of children. Based on these results, we accept that boys have larger facial distances than girls and the facial distances increase over age in the two groups of children.
Example 2: Change in lung NO metabolites level data
The second example is an animal experimental study of the effects of hypercapnia with or without acidosis on NO production in the isolated ventilated-perfused rabbit lung, assessed by the concentration of NO metabolites (nitrite and nitrate) released into the perfusate. The study was conducted at Justus-Liebig-University, Giessen. The NO metabolites concentration (nmol/min) was measured at time points 0, 5, 10, 15, 30, 45, ..., and 180 minutes in three groups: normoxic normocapnia (NXNC, n = 7), normoxic hypercapnia with acidosis (NXHCA, n = 4) and normoxic hypercapnia with normal pH level (NXHCN, n = 6). Since there was some variation between the baseline measurements, values were given as changes from the baseline. There were six samples (lungs) with incomplete measurements.
Figure 2 displays the mean profiles of change in NO metabolites level data over time for the three groups. The mean profiles increase over time points in all of the groups. However, it is not expected that the patterns of change in NO metabolites level and the overall means will differ between the three conditions.
Figure 2. Mean profiles of change in NO metabolites level.
In this data set, the UMA could not be conducted, because the number of measurements (m = 14) was larger than that of the samples with complete measurements (11 lungs) during the period of study. On the other hand, the UMA and repeated measures ANOVA approaches are not able to handle the experimental units with missing observations.
Table 5 displays the results for the LMM with random intercept, the LMM with random intercept and slope, and the SMA. Note that AIC prefers the random intercept and slope models with UNS, IND and AR1 covariance structures (models 6, 4 and 5, respectively), whereas the random intercept and slope models with IND and AR1 covariance structures (models 4 and 5, respectively) are preferred based on BIC. The reason is that BIC imposes a heavier penalty than AIC as the number of parameters in the model increases. Since there were a limited number of lungs and a large number of measurements, the danger of overfitting was high. In such cases, it is more reasonable to rely on BIC to select the most parsimonious model. Note that model 6 requires far more parameters to be estimated (d = 112) than model 4 (d = 8).
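The difference between the two penalties can be made explicit with the standard definitions of the criteria, where ℓ is the maximized (restricted) log-likelihood, d is the number of estimated parameters and n is the sample size entering the penalty. The exact n used depends on the software and estimation method and is treated here as an assumption.

```latex
\mathrm{AIC} = -2\ell + 2d,
\qquad
\mathrm{BIC} = -2\ell + d \log n,
\qquad
\underbrace{\frac{m(m+1)}{2}}_{\text{UNS parameters}} = \frac{14 \cdot 15}{2} = 105
\quad \text{for } m = 14 .
```

Since log n exceeds 2 once n is at least 8, each extra parameter costs more under BIC than under AIC. An unstructured covariance matrix for m = 14 measurements alone contributes 105 parameters, compared with a single variance under the IND structure, which accounts for most of the gap between d = 112 and d = 8.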
Table 5. Change in NO metabolites level data: summary of test results
Based on the results of the LMMs 4 and 5 selected on the basis of BIC, and also the SMA, one can accept that the rates of NO metabolites change in the three groups do not differ. Although this result coincides with that obtained by the most complicated model 6, the unsuitable models 1 and 3 reject the hypothesis of no interaction effect which is not illustrated in Figure 2.
All the LMMs, as well as the SMA, confirmed the effect of time on increasing the mean change over time in all of the groups. Except for the unreasonable model 6, all the models and the SMA confirmed that the mean change profiles for the three groups were the same throughout the time points; therefore, there was no significant group effect.
Discussion
Based on the simulation and example results, it was found that obtaining accurate inferences in an LMM requires considerable statistical knowledge of the true and working covariance structures. However, due to developments in computing, mixed models are nowadays widely used in experimental designs and clinical trial studies where the sample sizes are not sufficiently large and/or the number of measurements is sometimes large. This serious issue has previously been reported in a simulation study by Park [12] in a somewhat different setting, where there was no random effect in the data-generating process. The interested reader is referred to [16-19] for sample size and power calculations in repeated measurements analysis.
Interestingly, the SMA was robust to the true covariance structures in testing main and interaction effects even for small sample sizes and a large number of measurements. Moreover, the SMA was a powerful method for the analysis of linear trend data, and its empirical powers were, in general, convincingly close to those of the best-fitting LMM. This means that the least squares slope and the mean of response are appropriate measures to summarize the corresponding effects.
In this study, we fitted the LMMs using the "nlme" package in R, which follows the inner-outer approach for calculating the denominator degrees of freedom (df) of the F statistics [20]. In comparison with the packages nlme and lme4 in R, the MIXED procedure in SAS also provides the Satterthwaite and Kenward-Roger approximation methods for calculating the denominator df, which can result in some improvements in the resulting p-values. Although the superiority of these complex methods in terms of better preservation of type I error rates has been previously illustrated in unbalanced designs [21-23], the differences are rather negligible when LMMs are employed in the context of longitudinal analyses and there are no missing data. The R packages do have the advantage over the SAS procedure of providing the useful alternative algorithms of Monte Carlo simulation and parametric bootstrap for obtaining more sensible p-values and confidence intervals. However, these are too computationally intensive to be included in a simulation study.
The SMA clearly dominated the traditional UMA in testing time and interaction effects. The reason is that the SMA utilizes the linear trend in such data by computing the least squares slopes. However, the UMA assumes a more general nonlinear model with more parameters which must be estimated, and also imposes the most complex structure on the covariances of errors in which it may not be necessary.
Though not reported here, some simulations based on non-normal data showed that, in general, the approaches were relatively robust to departures from multivariate normality. This has been reported previously for the two-sample Hotelling's T^{2} test [24,25] and, to some extent, for LME models [26,27].
This paper did not aim to deal with missing observations or with techniques for baseline or pretreatment measurements. If the missing observations do not occur completely at random, this can introduce bias into parameter estimation and decision-making in statistical models. Barton and Cramer [28] and Catellier and Muller [29] have proposed several approximations of the denominator df for this issue. In this respect, the performance of the SMA is highly dependent on weighting the individuals' summary statistics [30], which may be cumbersome in practice. There are also more complex and efficient approaches to adjust for the effect of the baseline value (or values) in the SMA, such as including the baseline (or the average of the baselines) or the estimated intercept as a covariate in an analysis of covariance (ANCOVA) model [4].
Conclusions
It was shown that the SMA, on the basis of the two summary measures, was a simple, safe and powerful method for testing main and interaction effects, and that it performed nearly as well as the best-fitting LMM. However, the LMM often led to seriously inflated type I error rates, and hence nonsensical inferences, when the covariance structure for errors was misspecified. Moreover, this simple approach dominated the widely used UMA in assessing linear trend data from a mixed model framework. The SMA is recommended as the first choice to confidently analyze linear trend data with a moderate to large number of measurements and/or small to moderate sample sizes.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
MV proposed the basis of the paper, carried out the simulations and prepared the majority of the manuscript. SMTA reviewed the literature, guided the simulations and helped to draft the manuscript. MT participated in describing the simulation results and helped to draft the manuscript. FK provided the data of example 2 and participated in analyzing the data set. All authors approved the final version of the manuscript and did not have any competing interests.
Acknowledgements
The authors would like to thank Dr. N. Weissmann, Excellence Cluster Cardio-Pulmonary System (ECCPS), Justus-Liebig-University, Giessen for the opportunity to carry out the isolated lung experiments in his facility (Example 2 data set). We would like to thank Dr Nasrin Shokrpour at Center for Development of Clinical Research of Namazee Hospital and also Mrs. Sheryl Nikpoor, a Native American living in Shiraz, for editorial assistance.
References

1. Wishart J: Growth-rate determination in nutrition studies with the bacon pig, and their analysis.
2. Everitt B: The analysis of repeated measures: a practical review with examples. The Statistician 1995, 44:113-135.
3. Frison L, Pocock S: Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design. Stat Med 1992, 11:1685-1704.
4. Frison L, Pocock S: Linearly divergent treatment effects in clinical trials with repeated measures: efficient analysis using summary statistics. Stat Med 1997, 16:2855-2872.
5. Matthews JNS, Altman DG, Campbell MJ, Royston P: Analysis of serial measurements in medical research. Brit Med J 1990, 300:230-235.
6. Senn S, Stevens L, Chaturvedi N: Repeated measures in clinical trials: simple strategies for analysis using summary measures. Stat Med 2000, 19:861-877.
7. Arndt S, Turvey C, Coryell W, Dawson J, Leon A, Akiskal H: Charting patients' course: a comparison of statistics used to summarize patient course in longitudinal and repeated measures studies. J Psychiat Res 2000, 34:105-113.
8. McMahon R, Arndt S, Conley R: More powerful two sample tests for differences in repeated measures of adverse effects in psychiatric trials when only some patients may be at risk. Stat Med 2005, 24:11-21.
9. Boik R: Scheffé's mixed model for multivariate repeated measures: a relative efficiency evaluation. Commun Stat-Theor M 1991, 20:1233-1255.
10. Davidson M: Univariate versus multivariate tests in repeated-measures experiments.
11. Littell RC, Pendergast J, Natarajan R: Tutorial in biostatistics: Modelling covariance structure in the analysis of repeated measures data.
12. Park T, Park J, Davis C: Effects of covariance model assumptions on hypothesis tests for repeated measurements: analysis of ovarian hormone data and pituitary pteryomaxillary distance data. Stat Med 2001, 20:2441-2453.
13. Patterson H, Thompson R: Recovery of inter-block information when block sizes are unequal. Biometrika 1971, 58:545.
14. Potthoff R, Roy S: A generalized multivariate analysis of variance model useful especially for growth curve problems.
15. Davis C: Statistical methods for the analysis of repeated measurements. Springer Verlag; 2002.
16. Ahn C, Overall JE, Tonidandel S: Sample size and power calculations in repeated measurement analysis. Comput Meth Prog Bio 2001, 64:121-124.
17. Johnson J, Muller K, Slaughter J, Gurka M, Gribbin M, Simpson S: POWERLIB: SAS/IML Software for Computing Power in Multivariate Linear Models.
18. Liu G, Liang KY: Sample size calculations for studies with correlated observations. Biometrics 1997, 53:937-947.
19. Muller KE, LaVange LM, Ramey SL, Ramey CT: Power calculations for general linear multivariate models including repeated measures applications. J Am Stat Assoc 1992, 87:1209-1226.
20. Pinheiro J, Bates D: Mixed-effects models in S and S-PLUS. Springer Verlag; 2009.
21. Fai AHT, Cornelius PL: Approximate F-tests of multiple degree of freedom hypotheses in generalized least squares analyses of unbalanced split-plot experiments. J Stat Comput Sim 1996, 54:363-378.
22. Kenward M, Roger J: Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 1997, 53:983-997.
23. Schaalje G, McBride J, Fellingham G: Approximations to distributions of test statistics in complex mixed linear models using SAS Proc MIXED.
24. Everitt BS: A Monte Carlo investigation of the robustness of Hotelling's one- and two-sample T^{2} tests. J Am Stat Assoc 1979, 74:48-51.
25. Nachtsheim CJ, Johnson ME: A new family of multivariate distributions with applications to Monte Carlo studies. J Am Stat Assoc 1988, 83:984-989.
26. Jacqmin-Gadda H, Sibillot S, Proust C, Molina JM, Thiebaut R: Robustness of the linear mixed model to misspecified error distribution. Comput Stat Data An 2007, 51:5142-5154.
27. Verbeke G, Lesaffre E: The effect of misspecifying the random-effects distribution in linear mixed models for longitudinal data. Comput Stat Data An 1997, 23:541-556.
28. Barton C, Cramer E: Hypothesis testing in multivariate linear models with randomly missing data. Commun Stat-Simul C 1989, 18:875-895.
29. Catellier DJ, Muller KE: Tests for Gaussian repeated measures with missing data in small samples. Stat Med 2000, 19:1101-1114.
30. Matthews J: A refinement to the analysis of serial data using summary measures. Stat Med 1993, 12:27-37.