Abstract
Background
Repeated data collection is desirable when monitoring fluctuating conditions. Mobile phones can be used to gather such data from large groups of respondents by sending and receiving frequently repeated short questions and answers as text messages.
The analysis of repeated data involves some challenges. Vital issues to consider are the within-subject correlation, the correlation between measurement occasions and the presence of missing values.
The overall aim of this commentary is to describe different methods of analyzing repeated data. It is meant to give an overview for the clinical researcher in order for complex outcome measures to be interpreted in a clinically meaningful way.
Methods
A model data set was formed using data from two clinical studies, where patients with low back pain were followed with weekly text messages for 18 weeks. Different research questions and analytic approaches were illustrated and discussed, as well as the handling of missing data. In the applications the weekly outcome “number of days with pain” was analyzed in relation to the patients’ “previous duration of pain” (categorized as more or less than 30 days in the previous year).
Research questions with appropriate analytical methods
1: How many days with pain do patients experience? This question was answered with data summaries.
2: What is the proportion of participants “recovered” at a specific time point? This question was answered using logistic regression analysis.
3: What is the time to recovery? This question was answered using survival analysis, illustrated in Kaplan-Meier curves, Proportional Hazard regression analyses and spline regression analyses.
4: How is the repeatedly measured data associated with baseline (predictor) variables? This question was answered using Generalized Estimating Equations, Poisson regression and mixed linear models analyses.
5: Are there subgroups of patients with similar courses of pain within the studied population? A visual approach and hierarchical cluster analyses revealed different subgroups using subsets of the model data.
Conclusions
We have illustrated several ways of analysing repeated measures, using both traditional analytic approaches available in standard statistical packages and recently developed statistical methods that utilize all the vital features inherent in the data.
Background
Information collected for clinical research is usually gathered from the participants using questionnaires (paper or web-based) or diaries. Diaries are often used when several points of measure are of interest, in studying the progress or the development of a condition over time. In theory, this is an excellent method. However, studies have shown that respondents have a tendency to backfill entries [1], and thus diary measurements rely heavily on memory at the expense of data validity.
The term “Ecological Momentary Assessment” (EMA) has been used to describe the “repeated sampling of subjects’ current behaviors and experiences in real time in subjects’ natural environment” [2]. This was previously exclusive to diaries. Through the use of mobile phones, sending and retrieving information repeatedly and as frequently as requested to large groups of people through text messages is possible. In Sweden, 94% of the population owned a mobile phone in 2008 [3]. Further, most people seem to carry their phone with them at all times, thus making measurements truly ecological, i.e. taking place in the patients’ own environment, which may be important when context is influencing the variable of interest. Further, the measurements are momentary, and the use of mobile phones makes seasonal disruption (such as holidays) of data collection minimal [4].
The “SMSTrack Questionnaire” [5] is a software system utilizing this technology; automatic short text messages are sent to study subjects at any desired frequency. The system has been developed specifically for research, and the responses are immediately recorded in a data sheet, which minimizes further data handling and thus the risk of error during this process. The data may be accessed by the researcher in real time on the internet, and the system has been shown to be highly financially favourable compared to questionnaires [6]. Further, a previous study showed a high response rate [7] without any optimization measures, and other studies showed exceptionally high response rates with simple interventions such as providing initial information and calling to remind respondents who fail to answer [4,6]. The method has been shown to be user-friendly and to yield good compliance [4]. It has been used to evaluate the clinical course of low back pain (LBP) [4,6-8] and to detect sports injuries in children [9]. By means of repeated longitudinal data, detailed prospective research regarding course, inception, recovery, exacerbation and periodicity is possible.
The analysis of such repeatedly collected data, regardless of collection tool, presents new challenges. Even when compliance is high, invariably most respondents will have some missing values when measured repeatedly over months or years. Further, as the data produced contain repeated measurements of the same individuals over time, within-subject correlation must be considered in the analysis. In addition, when considering patients, they will often come from different clinics, introducing one more level into the analysis. Finally, it is desirable to analyze the complex outcome measure so that it can be interpreted in a clinically meaningful way.
The overall aim of this commentary is to describe different methods of analyzing repeated data. Different statistical methods were applied to a model data set based on actual data collected with text messages in two of the referenced studies [4,7]. The results were used as a basis for discussion of the challenges, possibilities and appropriateness of each method.
Methods
Participants and measurements
The data used to make the model data set were aggregated from two clinical studies, one Danish [7] and one Swedish [4]. By using data from two studies, the intent was to add diversity and to increase the number of highly compliant participants. From both cohorts, respondents with very low response rates (those answering one or two of the 18 weeks) were removed before merging the datasets. Both studies examined patients with low back pain (LBP) in chiropractic consultations. The Swedish study was approved by the Ethics committee at the Karolinska Institutet (2007/145831/4) and the Danish study was reviewed by the local ethics committee which stated that it did not need approval. A total of 244 patients were included in the model data set; 49% were male, the mean age was 43.5 years (SD 11.0) and 50% of the sample had had LBP for more than 30 days in the previous year. Throughout this commentary, we will use “previous duration” as the baseline characteristic of interest. Thus, the sample will be stratified into patients with short (< 30 days) and long (>30 days) duration of LBP in the previous year, and comparisons between these two groups will be made.
The patients in both studies were informed about the text message method, the exact wording of the text message question (asking about the number of days with bothersome LBP during the preceding week) as well as the answer options (a number between 0 and 7) at inclusion. In both studies patients were followed for 18 weeks with weekly text message questions.
It should be noted that this model data set represents a clinical situation in which patients initially report relatively high levels of pain. A treatment period follows, in which pain reduction is aimed for. Thus, the model data set is not a surveillance of patients in a steady or normal state of health, in which recurrence or inception of a condition would be of interest.
Analytic approaches
The method of analysis depends mainly on the research question asked. In general terms, SMSTrack data can be used as an outcome in variable-oriented (group level) prognostic or effect studies. The analysis may also be person-oriented [10], searching for subgroups with distinct pain patterns with small within-subgroup variation and high between-subgroup variation.
The method of analysis also depends on the data. Some analyses are suitable only for continuous data and not for categorical data. Further, data may have to be converted by a log or square root transformation to fit the assumption of normal distribution of the outcome.
Many methods of analysis assume independence of data and therefore are not suitable for repeated data. When a subject is measured repeatedly, the data from that individual (within-subject) are bound to show a stronger correlation than data from different individuals (between-subject). Failure to account for this covariance may result in inaccurate confidence intervals, that is, the nominal coverage of e.g. 95% will not be achieved.
In this commentary, clinically relevant research questions are presented with an overview of some of the possible analytic approaches. We have treated the measure “number of pain days” as both a continuous and a count variable to illustrate the different methods of analysis. Each method is then discussed in relation to the results of the different subset analyses. The research questions 1–4 are variable-oriented (related to group level analyses), and question number 5 is person-oriented.
Missing data
Repeated data collection invariably contains some missing data. Without special precautions, many analytical methods for longitudinal data cannot directly handle individuals who have missing data for any of the time points, and those individuals are simply excluded from analysis, which is known as listwise or casewise deletion. This might introduce bias when the follow-up time of a study is long with many measurements, as most subjects will have some missing data, and missing data most often do not occur at random. Exclusion of individuals will, apart from the risk of bias, also lead to less efficient estimates, e.g. wider confidence intervals.
Imputation of missing data is an alternative to facilitate analysis of repeated measures. One of the earliest imputation techniques for longitudinal data is the “last value carried forward”, where the missing value is simply replaced by the previously recorded value. When studying a clinical course, this would probably be a crude but rather accurate action, as each individual value correlates the strongest with the measurements closest in time. However, over recent decades more elaborate ways of imputation of data have been developed, see Little and Rubin [11] for a review of methods. In particular, multiple imputation methods have been developed.
These methods aim to not only impute values but also to take into account the increased sampling variability due to the imputation of missing data. However, in the examples presented in this commentary, no imputations were done, but the problems of missing longitudinal data deserve special attention. A second approach to handling missing data is direct maximum likelihood (DML), see Enders [12] for a primer on this method and a comparison with multiple imputation.
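The simplest of the techniques mentioned above, last value carried forward, can be sketched as follows. This is a minimal illustration only: no imputation was done in the analyses reported here, and the function name, the use of `None` for an unanswered week, and the example values are assumptions made for the sketch.

```python
from typing import List, Optional


def last_value_carried_forward(weekly: List[Optional[int]]) -> List[Optional[int]]:
    """Replace each missing weekly value (None) with the most recent observed value.

    Leading missing values stay None because there is nothing to carry forward.
    """
    filled: List[Optional[int]] = []
    last_seen: Optional[int] = None
    for value in weekly:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical patient: pain days per week, None marks an unanswered text message.
example = [None, 5, None, 3, None, None, 0]
print(last_value_carried_forward(example))
```

Note that this single imputation does not reflect the extra sampling variability that multiple imputation is designed to capture.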
Without imputation techniques, we handled non-response in various ways before analysis. One option was to include a participant only if a certain percentage, e.g. 80%, of the weekly questions during the follow-up period was answered. Another alternative was to define a certain period of particular clinical importance where replies must be given in order to avoid exclusion. In this commentary, the results from the following alternatives are reported: a) the full data set, b) respondents with a minimum of 80% response rate and c) those answering in all of the first eight weeks. The latter was chosen as these patients were included in the studies while experiencing an episode of LBP, and thus the first eight weeks of the trial were considered the period of the most interesting development.
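The two compliance-based subsets described above can be expressed as simple filters. The sketch below assumes a hypothetical data layout (a dictionary mapping a patient identifier to a list of 18 weekly answers, with `None` for an unanswered week); the function names and example patients are illustrative only.

```python
from typing import Dict, List, Optional

Responses = Dict[str, List[Optional[int]]]  # patient id -> weekly answers (None = missing)


def response_rate(weekly: List[Optional[int]]) -> float:
    """Proportion of weeks with a recorded answer."""
    return sum(v is not None for v in weekly) / len(weekly)


def high_compliers(data: Responses, threshold: float = 0.8) -> Responses:
    """Subset b): patients answering at least `threshold` of all weekly questions."""
    return {pid: w for pid, w in data.items() if response_rate(w) >= threshold}


def answered_first_weeks(data: Responses, n_weeks: int = 8) -> Responses:
    """Subset c): patients with an answer in every one of the first `n_weeks` weeks."""
    return {pid: w for pid, w in data.items() if all(v is not None for v in w[:n_weeks])}


# Three hypothetical patients followed for 18 weeks.
data: Responses = {
    "p1": [3] * 18,                      # answered every week
    "p2": [4] * 8 + [None] * 10,         # answered only the first eight weeks
    "p3": [5, None] + [2] * 16,          # missed week 2 only
}
print(sorted(high_compliers(data)))
print(sorted(answered_first_weeks(data)))
```

A patient can belong to one subset without belonging to the other, as the second and third hypothetical patients show.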
Research questions with appropriate analytical methods, results and discussions
We have raised some research questions that are relevant to researchers in the area of musculoskeletal pain and illustrated appropriate analytical methods with which to answer them. We have started with the very basic, descriptive analyses. Then, specific time points were selected for analyses. As a third step, a specific event was selected but time was allowed to vary, and in the fourth approach, both time and event may vary. Finally, the existence of subgroups was the basis for the last set of analyses.
How many days with pain do patients experience?
A: What is the total number of pain days? Table 1
Table 1. DESCRIPTIVE measures, variable oriented hypotheses
It has been suggested that the total number of days with back pain over a period would be a good way of assessing chronicity [13]. With weekly data collected prospectively, accurate data summaries are now feasible. In the model sample, the total number of days with LBP ranged from zero to 126, with a mean of 33.0 days. When analyzing the high compliers only, the mean was slightly higher.
B: What is the average number of pain days per week? Table 1
The individual means were calculated as sum of reported pain days divided by the number of weeks the individual actually reported data. When analyzing the subset of high compliers, the estimate of the group with long duration changed marginally.
Comments
Generally, summary outcomes are easy to interpret and give clinically meaningful estimates, and summarizing weekly data may indeed distinguish patients with a more persistent course. Summary scores could also be used as an outcome in multivariate models. On the other hand, the simplification of data that is the core feature of the summary statistics ignores course patterns and time to improvement. Potentially important differences could be missed. For example, the sum score during ten weeks would be equal for a patient having pain level 3 during all weeks and a patient with five weeks of “pain 6” followed by five weeks of “pain 0”.
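The two summary measures, and the limitation just described, can be illustrated with a short sketch. The function names and the `None` convention for unanswered weeks are assumptions; the two ten-week courses are the hypothetical patients from the example above.

```python
from typing import List, Optional


def total_pain_days(weekly: List[Optional[int]]) -> int:
    """Sum of reported pain days; unanswered weeks (None) contribute nothing."""
    return sum(v for v in weekly if v is not None)


def mean_pain_days_per_week(weekly: List[Optional[int]]) -> Optional[float]:
    """Reported pain days divided by the number of weeks actually answered."""
    answered = [v for v in weekly if v is not None]
    return sum(answered) / len(answered) if answered else None


steady = [3] * 10               # three pain days in every week
biphasic = [6] * 5 + [0] * 5    # five bad weeks followed by five pain-free weeks

# Both courses give the same summaries although the patterns differ markedly.
print(total_pain_days(steady), total_pain_days(biphasic))
print(mean_pain_days_per_week(steady), mean_pain_days_per_week(biphasic))
```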
What is the proportion of participants “recovered” at a specific time point?
A: Incidence of recovery week by week. Table 1 & Figure 1
Figure 1. Percentage of patients recovered (LBP days = 0) and unrecovered in each of 18 weeks following a first visit to a chiropractor (n = 212 week one; n = 186 week 18).
To describe the between-individual variance of the population, one possibility is to present the proportion that meets a criterion of interest at different time points. In our studies, recovery at different time points was studied. Thus, a dichotomised outcome, “recovered” (defined as reporting zero days with LBP [14]) and “not recovered”, was used for each week throughout the study. Comparisons of the proportion of recovered between groups can be done as in section B below, but this analysis should be applied only to a pre-specified time point of interest to avoid the problem of mass significance due to multiple testing [15]. As can be seen in Figure 1, the proportion of respondents who had recovered increased up until week 8 to about 50% of the population, and remained more or less stable thereafter. For patients with < 30 and > 30 days LBP last year, 58% and 31% of the populations respectively had recovered after eight weeks (results not shown). The high compliers had a recovery pattern very similar to that of the entire population (results not shown).
B: Incidence of recovery at a pre-specified time point, Table 1
Repeatedly collected data can also be used in ordinary logistic regressions choosing a specific time point as the time of interest (to patients, clinicians or to third party payers), i.e. not selecting time points on the basis of Figure 1 above. According to previous studies of patients with LBP in primary care, many patients report improvement by the fourth to fifth week after consultation [16]. Recovery (this time reporting zero or one day with pain [17]) was the event and five weeks was chosen as our time point of interest. The proportion of patients recovered at this point in time was compared for those with long and short previous duration. Note that this approach could easily accommodate more predictive variables. The result from the logistic regression analysis showed that the odds of recovery at week 5 were significantly higher for patients with a short previous duration compared to those with a long previous duration (OR = 3.71, 95% CI: 2.1-6.6). Another outcome parameter is the risk ratio (RR), obtained using a logarithmic link function instead of the logit function in logistic regression, which was 1.75 (CI: 1.4-2.3). When analyzing the high compliers only, the estimates changed marginally.
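With a single binary predictor, the odds ratio from a logistic regression and the risk ratio from a log-link model coincide with the crude values from the 2×2 table of group by recovery status. The sketch below computes both with Wald confidence intervals on the log scale. It is not the analysis used to produce the estimates above: the counts are invented for illustration and the function name is hypothetical.

```python
import math
from typing import Tuple

Interval = Tuple[float, float]


def odds_ratio_and_risk_ratio(
    rec_a: int, not_a: int, rec_b: int, not_b: int, z: float = 1.96
) -> Tuple[float, Interval, float, Interval]:
    """Compare group A with reference group B from a 2x2 table of counts.

    rec_* = number recovered, not_* = number not recovered.
    Returns (OR, OR interval, RR, RR interval) using Wald intervals on the log scale.
    """
    odds_ratio = (rec_a * not_b) / (not_a * rec_b)
    se_log_or = math.sqrt(1 / rec_a + 1 / not_a + 1 / rec_b + 1 / not_b)
    or_ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )

    n_a, n_b = rec_a + not_a, rec_b + not_b
    risk_ratio = (rec_a / n_a) / (rec_b / n_b)
    se_log_rr = math.sqrt(1 / rec_a - 1 / n_a + 1 / rec_b - 1 / n_b)
    rr_ci = (
        math.exp(math.log(risk_ratio) - z * se_log_rr),
        math.exp(math.log(risk_ratio) + z * se_log_rr),
    )
    return odds_ratio, or_ci, risk_ratio, rr_ci


# Invented counts: 60 of 100 recovered with short previous duration,
# 30 of 100 recovered with long previous duration.
or_value, or_ci, rr_value, rr_ci = odds_ratio_and_risk_ratio(60, 40, 30, 70)
print(f"OR = {or_value:.2f}, RR = {rr_value:.2f}")
```

The odds ratio is further from unity than the risk ratio whenever the outcome is common, which is why the two parameters should not be read interchangeably.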
Comments
With recovery defined as the event of interest, “relief rates” might be a better term than hazard rates for the outcome parameter. Using this outcome, the numbers and proportions of patients with a successful course can be estimated, and the numbers needed to treat can also be calculated. However, individual trajectories are ignored in favour of a population overview.
What is the time to recovery?
A: Incidence of “recovery” throughout the study period, Table 2 & Figure 2
Table 2. INCIDENCE measures, variable oriented hypotheses
Figure 2. Incidence of “recovery” (reporting zero or one pain days in two consecutive weeks), stratified by previous duration (> 30 days in pink, < 30 days in blue) for the full data set.
Using repeated measures, it is possible to monitor a population at risk to study the incidence of an event over time, that is, an extension of the approach in section 2 above. To our knowledge, this type of data has thus far been unobtainable and provides a “true” incidence of the event studied. When exploring different conditions/variables, the event will be defined according to clinical/medical or other parameters.
In survival analysis, one of the main application areas for this kind of analysis, mortality is the event under study. As we would expect the patients to get better, we studied the positive event “recovery”, for this example data set defined as zero or one pain day reported in two consecutive weeks [18]. Our illustrated analysis can be described in general terms as a time-to-event analysis, using techniques and notions that emanate from survival analysis [19]. Kaplan-Meier curves are one of the basic descriptive tools to summarize outcome over time. Figure 2 displays the number of reported events in a Kaplan-Meier curve, i.e. the incidence of patients recovering. The log-rank test for differences in the rate of recovery between the long and short duration groups was significant (p < 0.001). The same analysis was performed for the high compliers (curve not shown). Again, the log-rank test showed significant differences in the rate of recovery between the two groups (p = 0.002).
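The event definition and the Kaplan-Meier estimate described above can be sketched in a few lines. The sketch rests on assumptions that are not stated in the text: the event time is taken as the second of the two consecutive low-pain weeks, an unanswered week (`None`) interrupts a run, and patients without an event are censored at their last week of follow-up. Function names are hypothetical.

```python
from typing import List, Optional, Tuple


def time_to_recovery(
    weekly: List[Optional[int]], max_pain_days: int = 1, run_length: int = 2
) -> Tuple[int, bool]:
    """Return (week, event): the first week completing `run_length` consecutive
    weeks with at most `max_pain_days` pain days, or (last week, False) if censored."""
    run = 0
    for week, value in enumerate(weekly, start=1):
        if value is not None and value <= max_pain_days:
            run += 1
            if run == run_length:
                return week, True
        else:
            run = 0  # a high or missing week interrupts the run
    return len(weekly), False


def kaplan_meier(durations: List[int], events: List[bool]) -> List[Tuple[int, float]]:
    """Kaplan-Meier estimate of the proportion NOT yet recovered after each event time."""
    at_risk = len(durations)
    not_recovered = 1.0
    curve = []
    for t in sorted(set(durations)):
        d = sum(1 for ti, e in zip(durations, events) if ti == t and e)
        c = sum(1 for ti, e in zip(durations, events) if ti == t and not e)
        if d:
            not_recovered *= 1 - d / at_risk
            curve.append((t, not_recovered))
        at_risk -= d + c  # events and censored patients both leave the risk set
    return curve


# Four hypothetical patients observed for five weeks.
patients = [[5, 1, 0, 0, 0], [6, 4, 1, 0, 0], [7, 7, 6, None, None], [5, 1, 4, 1, 3]]
times, flags = zip(*(time_to_recovery(p) for p in patients))
print(kaplan_meier(list(times), list(flags)))
```

The cumulative incidence of recovery at each time is one minus the estimated proportion not yet recovered.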
B: Incidence of “recovery” in relation to predictive variables, Table 2
Kaplan-Meier curves stratified by a predictor variable and a test for difference between categories of the predictor were the starting point for this analysis. More elaborate analysis allowing for several predictor variables can be done with Cox Proportional Hazard regression [19] with time (week) treated as a continuous variable. In our data set, it could be hypothesized that the risk of future LBP would be predicted by past LBP. If the risk of the condition studied changes between groups of subjects over time, the assumption of proportional hazards might be violated and should be adjusted for. When the outcome is measured at discrete time points (e.g. only a few weeks observed), a discrete hazard regression analysis [19] could be more appropriate than the Cox model. In our model data, a test of the proportional hazard assumption showed that the assumption was not rejected. The hazard ratio (HR = 1.95) showed significant differences (95% CI: 1.4-2.6) between the two groups, meaning that the group with a short previous duration had nearly double the “risk” of the studied event, recovery. Performing the analysis for the high compliers only did not change the estimate.
When the outcome event was considered to be measured at discrete time points and no assumptions were made of proportional hazards, a discrete hazard regression analysis showed a statistically significant hazard ratio of 2.03 (95% CI: 1.5-2.7) between the groups, again pointing towards the patients with a shorter previous duration of LBP having the best chance of recovery during the study period. Performing the analysis with the high compliers only did not change the estimate.
C: When does a particular event occur at an individual level? Table 2
Having access to repeatedly observed data, detailed changes in the course of the condition can be studied. Concerning LBP, it is often observed that the patient’s course is different in a first phase up to approximately the fourth or fifth week with a rapid recovery [7,16], and then a much slower improvement is observed [20]. The shift in the course may be estimated more precisely by applying a spline regression technique where two regression lines are fitted to describe the two sections of the course and the intersection (“knot”) between these regression lines estimates the shift in the course. This application of the spline regression technique is also known as a piecewise linear regression. The spline regression can be applied for the whole group, for predefined subgroups of interest such as those with long or short previous durations, and also for each individual separately provided that the number of weekly observations is sufficient (as described in section 5B below). In the model data set, the regression lines for short and for long previous duration were clearly separated, with statistically significant differences in three of the four parameters that define the spline regression lines. The patients with a short previous duration had a course change at 4.5 weeks, compared to the patients with a long previous duration, who had a later course change at 5.9 weeks. Analyzing the high compliers separately only marginally changed the estimates.
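One way to estimate such a knot is to fit a continuous two-segment line for each candidate knot and keep the knot with the smallest residual sum of squares. The sketch below does this with ordinary least squares using only the standard library. It is an assumed parameterization (intercept, initial slope, change in slope, and knot), not necessarily the specification used for the estimates above, and the example course is invented.

```python
from typing import List, Sequence, Tuple


def solve_3x3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_broken_stick(
    weeks: Sequence[float], pain_days: Sequence[float], candidate_knots: Sequence[float]
) -> Tuple[float, List[float], float]:
    """Fit y = b0 + b1*t + b2*max(0, t - knot) for each candidate knot.

    Returns (best knot, [b0, b1, b2], residual sum of squares).
    """
    best = None
    for knot in candidate_knots:
        rows = [[1.0, float(t), max(0.0, t - knot)] for t in weeks]
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        xty = [sum(r[i] * y for r, y in zip(rows, pain_days)) for i in range(3)]
        beta = solve_3x3(xtx, xty)
        sse = sum(
            (y - sum(bk * xk for bk, xk in zip(beta, r))) ** 2
            for r, y in zip(rows, pain_days)
        )
        if best is None or sse < best[2]:
            best = (knot, beta, sse)
    return best


# Invented course: pain days fall by one per week until week 5, then stay at one.
weeks = list(range(1, 19))
pain_days = [6 - t + max(0, t - 5) for t in weeks]
knots = [k / 2 for k in range(4, 33)]  # candidate knots from week 2 to week 16
knot, beta, sse = fit_broken_stick(weeks, pain_days, knots)
print(knot, [round(b, 3) for b in beta])
```

The slope after the knot is `b1 + b2`, so a value of `b2` close to `-b1` corresponds to a course that flattens out after the initial improvement.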
Comments
Looking at incidences by means of hazard ratios yields interpretable results in terms of proportions recovered and holds the possibilities of the analytic methods used for survival analyses [19]. Recovery from LBP may be defined more or less stringently concerning pain days and period. In a condition like LBP it is, however, problematic to define recovery as the “event” since patients often experience fluctuations in the condition or recurrences following a pain free interval [21]. The possibility of repeated or recurrent events is not dealt with in the standard implementation of survival analyses.
Further, the definition of event will influence the Kaplan-Meier curve. Had we defined recovery as four consecutive weeks with little or no pain (instead of two weeks), a smaller number of patients would have accomplished this and thus the curves would have been more horizontally oriented.
Throughout, regardless of the outcome parameters, the type of data (continuous or count) and the level of compliance, the group with a short previous duration of LBP had significantly higher “risks” (chance) of recovery. This suggests that for our model data set, the methods were robust as the conclusions pointed in the same direction regardless of method. Further, incidences over the full study period as well as at a specific point in time can be calculated from the full data set, as including poor compliers only marginally changed the estimates. In other data sets, this may, however, not be the case. In the mentioned Swedish study [4], poor compliers were found to have a less positive development of their LBP over time, which is why we suggested that the effect of high compliance should be evaluated.
Finally, we also illuminated a method to estimate the point of change in a course of pain using spline regression analysis. Again, a difference between the two groups was noted, and the use of only high compliers did not change the estimates to any large degree. Spline regression is, of course, an approximation of the true, rather fluctuating, course of pain during the 18 weeks. The specification of the spline regression has, however, been done with those few parameters that were of vital clinical interest. To capture all the features of the course of pain would surely need several more parameters and such an approach would probably lose clinical interpretability.
How is the repeatedly measured data associated with baseline (predictor) variables?
Variation in events over the whole time period. Table 3
Table 3. LINEAR MODELS, variable oriented hypotheses
In these examples, we examined the association of the baseline variable “previous duration” with the outcome “number of pain days”. Throughout, the effect of time is considered a fixed effect (to account for systematic differences between weeks and to obtain estimates for each separate week). All models in this section are statistically and computationally more advanced than those in the previous sections. This whole area of statistical models has expanded considerably during the last 10–15 years thanks to theoretical advancements as well as the development of suitable software. It is outside the scope of this article to give details here; we recommend texts such as those by Twisk [22] and Rabe-Hesketh [23].
A: This approach used either a multilevel mixed-effects logistic regression or a Generalized Estimating Equation (GEE) [24] to obtain an effect parameter as a subject-specific Odds Ratio (OR), or a population average OR, the former from the multilevel model, the latter from the GEE analysis. Both models are extensions of linear models to accommodate repeated data. The distribution of the outcome was assumed to follow a binomial distribution and a logit link function described the relation between the outcome and the predictor or baseline variables. The argument that the number of days will follow a binomial distribution was that the respondents evaluated a yes/no reply for each day when answering the weekly measure, really asking themselves “Did I have pain on Monday? On Tuesday?” etc. before summing up. Modeling the covariance structure of the correlated repeated measurements, the associations of the baseline variables as well as possible interactions of such variables with the outcome can be studied in several different ways. The outcome, number of pain days, was considered a count variable. The patients reporting short duration pain the previous year had significantly lower odds of reporting pain during the study period. Calculating the subject-specific odds ratio (OR = 3.31, 95% CI: 2.1-5.1) indicated the odds for a single subject, whereas the population average (OR = 1.95, 95% CI: 1.4-2.6) gave an estimate for the “averaged” subject closer to unity than the subject-specific odds ratio, as expected when the two approaches of estimation were considered. Performing the analysis for the high compliers only lowered the estimates somewhat for both subjects and population.
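The distinction between the two odds ratios can be made explicit with a model sketch. The notation is assumed for illustration and is not taken from the analyses reported here: $Y_{ij}$ is the number of pain days reported by patient $i$ in week $j$, $x_i$ indicates the previous-duration group, $\gamma_j$ is the fixed effect of week $j$, and $u_i$ is a patient-level random intercept.

```latex
\begin{align}
Y_{ij} \mid u_i &\sim \mathrm{Binomial}(7,\; p_{ij}) \\
\text{subject-specific (multilevel):} \quad
  \operatorname{logit}(p_{ij}) &= \beta_0 + \beta_1 x_i + \gamma_j + u_i,
  \qquad u_i \sim N(0, \sigma_u^2) \\
\text{population average (GEE):} \quad
  \operatorname{logit}\!\big(\mathrm{E}[Y_{ij}]/7\big) &= \beta_0^{*} + \beta_1^{*} x_i + \gamma_j^{*}
\end{align}
```

Under this sketch $\exp(\beta_1)$ is the subject-specific OR and $\exp(\beta_1^{*})$ is the population average OR. When $\sigma_u^2 > 0$, $|\beta_1^{*}|$ is generally smaller than $|\beta_1|$, which is consistent with the population average estimate lying closer to unity.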
B: As above, the outcome was considered a count variable, but a Poisson distribution was assumed instead of a binomial distribution. The multilevel analysis showed a significant difference (Incidence Rate Ratio, IRR = 1.92, 95% CI: 1.5-2.4) between the patients with a short previous duration of pain compared to those with a long previous duration, indicating that the former had lower odds of reporting many pain days. The estimate was lowered somewhat when analyzing the high compliers only. In this case, a multilevel Poisson regression [18] was appropriate. Our outcome “number of pain days” may not seem to be an obvious candidate for a Poisson distribution because of the upper limit of seven days each week, since a restriction of this kind is not appropriate for a Poisson distribution. We have added the Poisson analysis just to show the appropriate method for another type of outcome such as “number of times of taking pain medication” during the week, which in theory may have no or at least a very high upper limit.
The outcome parameter is an Incidence Rate Ratio (IRR). A property of the Poisson distribution is that the mean and the variance are equal. In some applications this may not be the case, and in particular the variance can be greater than the mean, which is referred to as overdispersion. Then an analysis using a negative binomial distribution may be appropriate (not described here, see [24]).
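A crude first check of the equal mean and variance property is to compare the sample variance of the counts with their mean. The sketch below is a descriptive screen only, ignoring covariates and the repeated-measures structure, so it is not a formal test; the function name and the example counts are invented.

```python
from statistics import mean, variance
from typing import Sequence


def dispersion_ratio(counts: Sequence[int]) -> float:
    """Sample variance divided by sample mean.

    Values well above 1 suggest overdispersion relative to a Poisson distribution.
    """
    return variance(counts) / mean(counts)


# Invented weekly counts: many patients at zero or at the maximum of seven.
polarised = [0, 0, 1, 7, 7, 0, 6, 7, 0, 1]
# Invented weekly counts clustered tightly around their mean.
tight = [2, 3, 2, 3, 2, 3]

print(round(dispersion_ratio(polarised), 2))
print(round(dispersion_ratio(tight), 2))
```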
C: Considering “days with pain,” a continuous outcome may not be the obvious choice either for our model data set, as the outcome variable “number of pain days” was discrete and had an upper limit of 7. In the investigation of other conditions, the outcome could be continuous. If so, the association with baseline variables can be studied using mixed linear models. In our example, an autoregressive covariance model was chosen assuming decreasing correlation with increasing time and confirmed with Akaike’s Information Criterion [25], and the associations of baseline variables as well as the interactions of these variables with the outcome were studied. It should be noted that the other available baseline variables besides previous duration were included in the model as well (not presented here). In mixed linear models, with time as the explanatory factor, the association of previous duration and weekly pain days showed a significant difference for patients reporting long and short previous durations (average difference in pain days 1.20 (95% CI: 0.8 – 1.5)).
Performing the analysis for the high compliers lowered the estimate, but did not change the significance. Including more baseline variables in the model did not affect the parameter estimates noticeably (analysis not shown).
Comments
These methods are designed for repeated measures and take correlation between outcome measures at different time points into account. In this way the richness of the data is maintained and trustworthy significance levels are achieved. However, the results may sometimes be more difficult to interpret in a clinically meaningful way. The models are statistically more sophisticated and require more work for the specification of the analyses, but they are implemented in standard statistical software (such as SPSS, STATA, SAS), and our analyses and results are based on these packages.
With GLM it is possible, as demonstrated, to use both count and continuous variables, and the outcome could be binomially, Poisson, or normally distributed. In our model data, regardless of model, the results all pointed in the same direction. We conclude that for our chosen variable and outcome, the choice of method may not have been critical. However, because our repeated outcome was most accurately classified as a count variable following a binomial distribution, we trusted the estimates from the generalized estimating equation model and the corresponding multilevel model to be the most valid estimates.
A final note has to be added to the results of this section about the significant interaction found between previous duration and week, the latter variable representing time under observation.
This implied that a second step in the analysis was necessary to fully understand how the pain course developed over time for the two duration groups. In the present context with examples and suggestions for analysis this step is not further elaborated.
Are there subgroups of patients with similar courses of pain within the studied population?
There are several methods that are useful when looking for patterns within repeated data using person-oriented approaches. The examples below range from purely descriptive (A), which rely on a clinical impression, through hierarchical methods (B), which are mathematical in origin but require a supplementary clinical judgment, to the very mathematical methods (C), which rely on the acceptance of a pre-specified statistical model.
A: Visual description, Tables 4 & 5
Table 4. SUBGROUPS, personbased hypotheses
Table 5. Categories used to classify individual pain patterns by visual analysis
The course of a disease (LBP in our model sample) can be described by a graphical representation of each participant’s development over time, in this case the number of pain days by week. In studies with relatively few individuals (up to about 200 participants) it is possible to analyze such courses by hand, describing the characteristics of the pain courses, including when the changes take place, and thus attempting to identify groups of patients with similar pain patterns. The inter-observer reliability of such a visual evaluation was found to be substantial (kappa = 0.7) between two observers analyzing 78 courses [7]. Depending on the level of detail in the visual description, this may result in very few subgroups or in as many subgroups as the number of individuals studied. Therefore, we suggest making clinically relevant a priori definitions of the subgroups to be looked for, in order to improve the objectivity and reliability of the visual descriptions. The method relevant for the categorization of course patterns will depend on the known clinical characteristics of the investigated disease and the type of population. Our approach to visually describing the pain patterns was performed in two stages, as described earlier [7]. In short, an early course (weeks 1 to 4) and a later course (week 5 and later) were defined a priori. It may be relevant to split other data sets differently depending on the condition studied and the length of the follow-up. Visually described patterns could be derived from the text message responses of 215 patients. The analysis was not performed if data were missing for more than two weeks in a row, in which case the subject was excluded.
The combination of the categories for the early and the late course resulted in 13 possible categories. All 13 categories were represented in the model data set, each containing between one (0.5%) and 64 (30%) patients (Table 5). Patients described as recovered had the fewest LBP days and patients classified as “worse” in the late course had the most LBP days. Further, the patterns with recovery or lasting improvement had a larger proportion of patients who had < 30 LBP days the previous year than the patterns of fluctuation or worsening (Table 5).
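The exclusion rule and the early/late split described above can be illustrated with a minimal Python sketch. It assumes that each patient's course is stored as a list of weekly pain-day counts with `None` marking an unanswered week; the function names and this data layout are hypothetical and are not taken from the analyses reported here.

```python
from typing import List, Optional, Sequence, Tuple

Course = Sequence[Optional[int]]  # weekly pain days; None = no answer that week


def longest_missing_run(weekly: Course) -> int:
    """Length of the longest run of consecutive missing weeks."""
    longest = current = 0
    for value in weekly:
        if value is None:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def eligible_for_visual_analysis(weekly: Course, max_gap: int = 2) -> bool:
    """True unless data are missing for more than max_gap weeks in a row."""
    return longest_missing_run(weekly) <= max_gap


def split_course(weekly: Course, early_weeks: int = 4) -> Tuple[List, List]:
    """Split a course into the early part (weeks 1-4 by default) and the rest."""
    return list(weekly[:early_weeks]), list(weekly[early_weeks:])
```

A course with a gap of exactly two weeks is kept under this reading of the rule, and one with a gap of three weeks is excluded. The cut-off between early and late course is a parameter because other conditions may call for a different split.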
B: Cluster analysis using a hierarchical method
The trend of the outcome variable over the study period can be explored with cluster analysis to search for subgroups. The simplest alternative would be to use all 18 individual weekly data points as cluster parameters, but this is difficult in practice because of both the heavy computational burden and missing data. Therefore, the individual courses in our model sample were summarized with different mathematical approaches, limiting the number of cluster parameters. The parameters were then used in a hierarchical cluster analysis, Ward’s method [26], to reveal subgroups. Ward’s method resulted in a dendrogram (Figure 3), a graphical representation of the cluster-building process, which was scrutinized visually to find the optimum number of clusters, together with a mathematical criterion such as that of Caliński and Harabasz [27] or Duda and Hart [28]. Then, the two best solutions suggested by the Caliński-Harabasz criterion were explored in terms of detail vs. overview to find the optimal final solution. Thus, in approach a), the Caliński-Harabasz criterion suggested that the four-cluster solution was optimal. For the method applied in b), the Caliński-Harabasz criterion suggested a six-cluster solution.
Figure 3. A dendrogram obtained with Ward’s method, describing the formation of clusters.
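To make the two ingredients above concrete, the following Python sketch shows a naive form of Ward's agglomeration and the Caliński-Harabasz index, using only the standard library. It is an illustration under stated assumptions, not the software routine used for the reported analyses: each patient is assumed to be represented by a tuple of numeric cluster parameters, the function names are hypothetical, and the brute-force search is only practical for small samples.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_labels(points, k):
    """Merge clusters until k remain, always choosing the pair whose union
    gives the smallest increase in the within-cluster sum of squares."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            ca = centroid([points[i] for i in clusters[a]])
            cb = centroid([points[i] for i in clusters[b]])
            na, nb = len(clusters[a]), len(clusters[b])
            increase = na * nb / (na + nb) * sq_dist(ca, cb)
            if best is None or increase < best[0]:
                best = (increase, a, b)
        _, a, b = best  # a < b, so deleting b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def calinski_harabasz(points, labels):
    """Ratio of between- to within-cluster dispersion, each divided by its
    degrees of freedom; needs 2 <= k < n and non-zero within dispersion."""
    n, k = len(points), len(set(labels))
    overall = centroid(points)
    between = within = 0.0
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))
```

In use, the index would be computed for a range of candidate values of k and the solutions with the highest values inspected alongside the dendrogram, since the criterion supports but does not replace the visual and clinical judgment.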
a) Each course was described by two regression lines describing the early trend and the later trend, respectively. Using a spline (nonlinear regression) technique, the intersection between the two was then calculated. From these analyses, four parameters described each profile: the intercept and slope of the early trend, the difference in slope between the early and late trend and the intersection between the two regression lines. These four parameters were then used as cluster parameters. Note that the regression parameters serve to approximate the weekly variation in the data, and they must be evaluated with respect to the degree of fit for each patient.
Cluster 1 (28 individuals): This subgroup contained the oldest individuals (mean age 47 years), who reported the highest number of total pain days (50 days).
Cluster 2 (68 individuals): This subgroup contained the youngest individuals (mean age 42 years), who reported 33 pain days throughout.
Cluster 3 (20 individuals): The patients here were mainly male (70%), reported the most leg pain (40%), reported least pain days (20 days) and most (85%) had had short previous duration of LBP.
Cluster 4 (45 individuals): This subgroup contained the largest proportion of patients (63%) that had had long previous duration of LBP and reported 36 days of pain throughout.
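The four course parameters used in approach a) can be illustrated with a linear spline with a single knot, y = b0 + b1·t + b2·max(0, t − knot), where b0 is the intercept, b1 the early slope, b2 the difference in slope between the early and late trend, and the knot the intersection of the two lines. The Python sketch below is an assumption-laden illustration rather than the routine used for the reported analyses: the knot is chosen by a simple grid search over candidate weeks, and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small system a x = b by Gauss-Jordan elimination with
    partial pivoting (raises ZeroDivisionError if a is singular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_broken_stick(weeks, pain_days, knots):
    """Least-squares fit of y = b0 + b1*t + b2*max(0, t - knot) for every
    candidate knot; returns (intercept, early_slope, slope_change, knot, sse)
    for the knot with the smallest residual sum of squares.
    Candidate knots must lie strictly between the first and last week."""
    best = None
    for knot in knots:
        design = [[1.0, float(t), max(0.0, float(t - knot))] for t in weeks]
        xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
        xty = [sum(r[i] * y for r, y in zip(design, pain_days)) for i in range(3)]
        b0, b1, b2 = solve_linear(xtx, xty)
        sse = sum(
            (y - (b0 * r[0] + b1 * r[1] + b2 * r[2])) ** 2
            for r, y in zip(design, pain_days)
        )
        if best is None or sse < best[4]:
            best = (b0, b1, b2, knot, sse)
    return best
```

The residual sum of squares is returned together with the parameters because, as noted above, the parameters only approximate the weekly variation and the degree of fit must be evaluated for each patient.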
b) Each course was described by the eight first weekly measurements only. As mentioned earlier, this was considered the period where the most change in pain days is likely to be noticed in our population. Thus, in this cluster analysis, eight cluster parameters were used.
Cluster 1 (12 individuals): This subgroup had many women (67%) and reported 37 pain days throughout.
Cluster 2 (24 individuals): This subgroup had the youngest patients (mean age 40 years), mainly women (67%) and reported 35 pain days throughout.
Cluster 3 (15 individuals): This group had mainly patients with long previous duration of pain (80%), and reported 58 days of pain throughout.
Cluster 4 (28 individuals): These patients were mainly male (64%), and most (79%) reported short duration of previous pain. They had 21 pain days throughout.
Cluster 5 (46 individuals): This group reported fewest days of pain, only 13.
Cluster 6 (8 individuals): This was the subgroup with the oldest individuals (mean age 49 years). They reported the most leg pain (50%) and had the highest number of pain days, 101 in total.
C: Other exploratory approaches
The cluster analysis shown here is just one of several alternatives that can be used for this and similar data sets. An excellent overview of cluster methods in theory and in applied research is found in Everitt BS, “Cluster Analysis” [26]. Among these methods are those using finite mixture densities, with a range of approaches including mixtures of multivariate normal distributions, mixtures for categorical data (latent class analysis) and Bayesian analysis of mixtures.
Comments
Several other possibilities also exist for aggregating data points into useful descriptions of the course suitable for cluster analysis. In addition to linear regression, a second- or third-degree regression can be used to approximate the course. Similarly, the spline function can be extended to contain two or more knots. Clinical judgement should be used to evaluate what is relevant for any particular condition. We have previously argued that, to secure solid course estimates, only highly compliant responders (arbitrarily defined as those answering more than 80% of the time) should be used in analyses such as a) [20]. Extending the argument, only those with a full response should be used when the crude data serve as cluster parameters, as in b).
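The two inclusion rules can be written down as a short Python sketch. It assumes that courses are held in a dictionary mapping a patient identifier to a list of weekly answers, with `None` for an unanswered week, and it reads "full response" in b) as complete data for the first eight weeks. Both the layout and the function names are hypothetical.

```python
def response_rate(weekly):
    """Proportion of weeks with a recorded answer (None marks a missing week)."""
    return sum(value is not None for value in weekly) / len(weekly)


def highly_compliant(courses, threshold=0.8):
    """Keep patients answering more than `threshold` of the weekly questions,
    as suggested for summary-parameter approaches such as a)."""
    return {pid: c for pid, c in courses.items() if response_rate(c) > threshold}


def complete_first_weeks(courses, n_weeks=8):
    """Keep patients with no missing answer in the first n_weeks and return
    only those weeks, as needed when crude data are the cluster parameters."""
    return {
        pid: list(c[:n_weeks])
        for pid, c in courses.items()
        if len(c) >= n_weeks and all(value is not None for value in c[:n_weeks])
    }
```

The two filters can select different patients: a responder who misses a single early week passes the 80% rule but is dropped from an analysis based on the crude first eight weeks.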
Further, in both A and B, the subgroups/clusters formed in this mathematical way should be examined for clinical meaningfulness. For instance, the available clinical baseline variables associated with each cluster can be tested for difference between clusters.
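One simple way to test a categorical baseline variable for differences between clusters is Pearson's chi-square test on the cluster-by-category table. The Python sketch below computes the statistic and its degrees of freedom with the standard library only. The function name is hypothetical, and the counts in the usage comment are invented for illustration and are not the study data.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of
    counts with one row per cluster and one column per category."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows are three clusters, columns are
# short vs long previous duration of pain.
example_table = [[10, 10], [15, 5], [5, 15]]
stat, dof = chi_square_statistic(example_table)
```

The statistic is then compared with the chi-square distribution for the given degrees of freedom (the 5% critical value for two degrees of freedom is about 5.99). With small clusters the expected counts may be too low for this approximation, in which case an exact test is the usual alternative.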
We have illustrated possible ways of exploring the model data set for subgroups based on the repeated data. Concerning nonspecific LBP, our approach was based on the hypothesis that patients with the same “category” of LBP might exhibit a similar clinical course. The visual description of individuals’ pain patterns was a pragmatic and clinically meaningful way to distinguish between obviously different patient profiles, but the method is time-consuming and only feasible in small samples. However, the choice of subgrouping method is, as for all analyses, dependent on which assumptions about the data and their distributions the researcher is willing to accept.
A number of data mining approaches can be used for pattern recognition instead, e.g. cluster analysis, latent class analysis, artificial neural networks and probabilistic data mining. It is beyond the scope of this commentary to test, describe and compare these.
Conclusions
When new methods of data collection are introduced, it is always pertinent to consider the possibilities, advantages, implications and challenges they entail. Mobile phones, a technology available to a majority of people in the modern world, open the door to repeated measurements from large populations.
Having access to repeated data, it is not self-evident what methods of analysis to use. We have intended to give an overview of some approaches to analyses considered by our group of researchers, but other relevant methods exist and, possibly, different data from other research areas may require yet different methods. In this commentary, the methods are presented very briefly, and we encourage readers to use the references included for a deeper understanding.
Ultimately, the choice of analytic approach will depend on the following questions: what is the research question to be answered, what kind of data is the outcome variable, what distribution does it have and what is the within-subject correlation? The answers will determine the most appropriate method of analysis.
In analyzing repeated data, the issue of within-subject correlation can be avoided by aggregating the individual measures into a summary measure prior to analysis. However, information on individual variation is then lost, resulting in oversimplification. Thus, methods of analysis that account for this covariance may be the most appropriate.
In our model data, patients with > 30 days LBP the preceding year consistently demonstrated an increased risk of a “poor prognosis” compared to those with < 30 pain days in all the variable-oriented analyses. Further, different patient profiles could be identified based on the pain trajectories emerging from cluster analyses of the frequently repeated outcome measure. Thus, it seems that repeated measures can be analysed in several meaningful ways with both traditional analytic approaches using standard statistical packages, as well as recently developed statistical methods that will utilize all the vital features inherent in the data.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
IA was involved in the design of this commentary, participated in the analyses and wrote the first manuscript draft. LB was responsible for the analyses. AK and NW were involved in the design and analyses of the data. IJ and GB were supervising the study process. All authors were involved in finalizing the manuscript.
Acknowledgements
We would like to thank Professor Charlotte Leboeuf-Yde, who instigated this article. This work was funded in part by the Swedish Chiropractors’ Association and by the European Chiropractors’ Union.
References

Stone AA, Shiffman S, Schwartz JE, Broderick JE, Hufford MR: Patient compliance with paper and electronic diaries.
Control Clin Trials 2003, 24(2):182-199.

Shiffman S, Stone AA, Hufford MR: Ecological momentary assessment.
Annu Rev Clin Psychol 2008, 4:1-32.

The Swedish population's use of the internet and telephones - an individ survey 2008. 2008.

Axen I, Bodin L, Bergstrom G, Halasz L, Lange F, Lovgren PW, Rosenbaum A, Leboeuf-Yde C, Jensen I: The use of weekly text messaging over 6 months was a feasible method for monitoring the clinical course of low back pain in patients seeking chiropractic care.
J Clin Epidemiol 2012, 65(4):454-461.

Johansen B, Wedderkopp N: Comparison between data obtained through real-time data capture by SMS and a retrospective telephone interview.
Chiropr Osteopat 2010, 18(1):10.

Kongsted A, Leboeuf-Yde C: The Nordic back pain subpopulation program - individual patterns of low back pain established by means of text messaging: a longitudinal pilot study.
Chiropr Osteopat 2009, 17:11.

Jensen RK, Leboeuf-Yde C, Wedderkopp N, Sorensen JS, Manniche C: Rest versus exercise as treatment for patients with low back pain and Modic changes. A randomised controlled clinical trial.
BMC Med 2012, 10(1):22.

Klakk H, Jespersen E, Wedderkopp N: Overweight - a risk factor of overuse injuries in children? the childhood health, activity and motor performance school study - a 3-year controlled intervention study.

Bergman LR, Magnusson D: A person-oriented approach in research on developmental psychopathology.
Dev Psychopathol 1997, 9(2):291-319.

Little RJA, Rubin DB: Statistical Analysis with Missing Data. 2nd edition. Wiley, NJ; 2002.

Enders CK: A primer on the Use of Modern Missing-Data Methods in Psychosomatic Medicine Research.
Psychosom Med 2006, 68:427-436.

Kent PM, Keating JL: The epidemiology of low back pain in primary care.
Chiropr Osteopat 2005, 13:13.

Reigo T, Tropp H, Timpka T: Clinical findings in a population with back pain. Relation to one-year outcome and long-term sick leave.
Scand J Prim Health Care 2000, 18(4):208-214.

Hughes M: Biostatistics in Clinical Trials. Wiley, Chapter on Multiplicity; 2001.

Axen I, Rosenbaum A, Robech R, Wren T, Leboeuf-Yde C: Can patient reactions to the first chiropractic treatment predict early favorable treatment outcome in persistent low back pain?
J Manip Physiol Ther 2002, 25(7):450-454.

Hancock MJ, Maher CG, Latimer J, Herbert RD, McAuley JH: Can rate of recovery be predicted in patients with acute low back pain? Development of a clinical prediction rule.
Eur J Pain 2009, 13(1):51-55.

McGorry RW, Shaw WS, Lin JH: Correlations between pain and function in a longitudinal low back pain cohort.

Mills M: Introducing Survival and Event History Analysis. Sage, London; 2010.

Axen I, Bodin L, Bergstrom G, Halasz L, Lange F, Lovgren PW, Rosenbaum A, Leboeuf-Yde C, Jensen I: Clustering patients on the basis of their individual course of low back pain over a six month period.
BMC Musculoskelet Disord 2011, 12:99.

de Vet HC, Heymans MW, Dunn KM, Pope DP, van der Beek AJ, Macfarlane GJ, Bouter LM, Croft PR: Episodes of low back pain: a proposal for uniform definitions to be used in research.
Spine (Phila Pa 1976) 2002, 27(21):2409-2416.

Twisk J: Applied Multilevel Analysis. Cambridge University Press, Cambridge; 2006.

Rabe-Hesketh S, Skrondal A: Multilevel and Longitudinal Modeling Using Stata. 3rd edition. Stata Press; 2012.

Rabe-Hesketh S, Skrondal A: General linear mixed effects models. In Longitudinal Data Analysis: A Handbook of Modern Statistical Methods. Chapman & Hall/CRC, Boca Raton; 2007.

Akaike H: A new look at the statistical model identification.
IEEE Trans Autom Control 1974, 19(6):716-723.

Everitt BS, Landau S, Leese M, Stahl D: Cluster Analysis. 5th edition. Wiley, London; 2011.

Caliński R, Harabasz J: A dendrite method for cluster analysis.

Duda R, Hart P: Pattern Classification and Scene Analysis. Wiley, New York; 1973.