Abstract
Background
In the randomized study of interferon beta1b (IFN beta1b) for multiple sclerosis (MS), it has usually been evaluated the simple annual relapse rate as the study endpoint. This study aimed to investigate the performance of various regression models using information regarding the time to each recurrent event and considering the MS specific data generation process, and to estimate the treatment effect of a MS clinical trial data.
Methods
We conducted a simulation study with consideration of the pathological characteristics of MS, and applied alternative efficacy estimation methods to real clinical trial data, including 5 extended Cox regression models for timetoevent analysis, a Poisson regression model and a Poisson regression model with Generalized Estimating Equations (GEE). We adjusted for other important covariates that may have affected the outcome.
Results
We compared the simulation results for each model. The hazard ratios of real data were estimated for each model including the effects of other covariates. The results (hazard ratios of highdose to lowdose) of all models were approximately 0.7 (range, 0.613  0.769), whereas the annual relapse rate ratio was 0.714.
Conclusions
The precision of the treatment estimation was increased by application of the alternative models. This suggests that the use of alternative models that include recurrence event data may provide better analyses.
Background
Multiple sclerosis (MS) is the most common demyelinating disorder of the central nervous system, and is characterized by repeated episodes of neurological dysfunction with variable remission. Since 1993, the beneficial effects of interferon beta have been shown [1], and in Japan, interferon beta1b (IFN beta1b) has significantly reduced relapse rates and reduced MRI lesion areas in patients with relapsingremitting MS [2]. Recently, Kappos et al. [3] reported that IFN beta1b can delay the conversion to clinically definite MS. Carroll [4] performed a comprehensive review of clinical studies of MS therapies.
The longterm treatment effects for chronic recurrent diseases such as MS should be evaluated in clinical trials. In the past, the primary endpoint in clinical trials of MS has been the annual relapse rate, the change in a clinical indicator such as the Expanded Disability Status Scale (EDSS) score or total area of MS lesions on the MRI scan from entry time, the proportion of nonrelapsed patients, or the time to the first recurrence [1,2,510]. Meanwhile, extended methods of survival analysis for timetoevent data have been proposed, and such methods are useful when study subjects experience 2 or more events. Considering the recurrent events in survival analysis should theoretically increase the estimation efficiency regarding the effects of treatment [11]. Although these methods have not generally been applied to MS clinical trial data, Wang et al. [12] recently examined some of the models. Excellent reviews are available regarding how these methods can contribute to the estimation of treatment effects [11,13,14]. When using these models, it is important to pay attention to the nature of the models because the results of the estimation are highly dependent on the clinical situation [15]. In real clinical studies, the concerned events might occur rarely, several events might occur simultaneously, or several events might occur separately with high correlation. Appropriate models should be selected after considering the relationship between the assumptions of the models and the manner in which the events occur. For example, if we analyse the disease data such that the deteriorations of many lesions are found simultaneously, we should select the model that can manage the count data approach rather than the gap time modeling of event history analysis.
In this study, we focused on the extended Cox proportional models and Poisson regression model using Generalized Estimating Equations (GEE), which can be analyzed using existing statistical packages such as SAS. Using these regression models, we can estimate the adjusted treatment effect while considering the important covariates that might affect the outcomes, whereas the relapse rates provide only nonadjusted estimate [16]. The objective of this study was to investigate the performance of these models through a simulation study with MSspecific data generation processes and to apply various models that are used for estimating the treatment effect to a real clinical trial data set. This data set comprises the effect of IFN beta1b on MS with special attention to subjects with relapsingremitting MS.
Methods
Subjects
A phase II randomized controlled clinical trial was conducted to compare the effect of 2 different doses (highdose: 250 μg and lowdose: 50 μg) of IFN beta1b on relapsingremitting MS in Japan. Details of the trial design, inclusion criteria, baseline demographics, and efficacy results have been published [2]. In the trial, 205 patients with relapsingremitting MS were randomized, and efficacy was assessed in 188 patients (55 male and 133 female patients). The primary endpoint of the study was the evaluation of the annual relapse rate. The percentage of patients who experienced a relapse more than once during followup was 55.8% (53/95) of patients in the highdose group and 65.6% (61/93) of patients in the lowdose group. In these groups, the maximum number of relapses was 7, and the minimum, 0, with a median of 1. The annual relapse rates in the high and lowdose groups as estimated by the persontime method were 0.763 and 1.069, respectively (relapse rate ratio = 0.714; 95% CI 0.560 0.910; p = 0.006).
Models
Various survival models used for analysis of recurrent event data and that handle clustered and multiple event data have been proposed. Let λ_{ij}(t) be the hazard function of the jth recurrence of the ith subject at time t; λ_{0 j}(t) be the baseline hazard function of the jth recurrence at time t; Y_{ij}(t) be the indicator variable for the jth recurrence of the ith subject at time t, which is 1 when the subject is at risk and under observation and 0 otherwise; X_{ij}(t) be the jth covariate vector of the ith subject at time t; and βj be the parameter vector for the jth recurrence, which includes the treatment effect. When each recurrence is assumed to have common effect, we omit the subscript j. Schematic forms of the models are shown in Figure 1.
Figure 1. Schematic forms of the various extended timetoevent Cox regression models. Each arrow represents a stratum. Arrow diagrams describe about the behavior of ID = 1 in the sample data (DATA = MS) in Appendix (Additional file 1), who experiences the first event at day 51, second event at day 185, third event at day 413 and finally have censored at day 692. (A) "Timetofirstevent Cox model" only uses the information of the time to first event (the day 51). (B) "AG model" shows the renewal process of the events. (C) "PWP model" shows how the conditional models are constructed. (D) "WLW model" and "LWA model" model the marginal distribution of each event occurrence time.
The first model to be considered is the ordinary timetofirstevent model, which is formulated with the Cox proportional hazard model (hereafter referred to as "timetofirstevent Cox model"). It handles only the timetofirstevent data and ignores the information of the second or more events. Hereafter, the models that can deal with this lack of information are shown.
Andersen and Gill [17] extended the Cox proportional hazard model in the counting process formation (AG model). A Poisson process in which, each counting process has independent increments is assumed so that multiple events within the same subjects are regarded independently. The hazard of subject i at time t is
Although subjects who have once experienced an event are excluded from the risk set from that time in the usual Cox model, subjects who have experienced at least 1 event and are under observation can also be included in the risk set in the AG model. Because the baseline hazard is assumed to be common among subjects, this model ignores the individual differences and might be effective when the overall treatment effects are of interest.
Prentice, Williams, and Peterson [18] also extended the Cox model. They proposed the conditional model, which assumes that a subject is not at risk for the jth event until he/she has experienced the (j1)th event, where Y_{ij}(t) is 0 until the (j1)th event and after which it becomes 1 (PWP model). In terms of the time scale, 2 models are used. One model measures from the entry time and is called the total time model (PWPT model). The hazard of the jth recurrence of subject i at time t is
The other model resets the clock at every recurrence and is called the gap time model (PWPG model). Assigning t_{j1 }as the time at which the (j1)th event occurs, the hazard of the jth recurrence of subject i at time t is
Although the PWP model makes the interpretation easy, the sizes of the risk sets become relatively small as the number of events increases, making the estimates unstable.
Wei, Lin, and Weissfeld [19] modeled the marginal distribution of the time of each occurrence of the event using the Cox model (WLW model). The hazard of the jth recurrence of subject i at time t is
In this model, each recurrence is modeled as a separate stratum, and each subject appears in all of the strata so that no assumptions are made with respect to the recurrence process. However, this may result in substantial efficiency loss because it ignores the obvious dependency structure, in that the (j+1)th recurrent time must exceed the jth.
On the other hand, Lee, Wei, and Amato [20] proposed a model (LWA model) that assumes a common baseline hazard, where the hazard can be written as
The same subjects can enter several risk sets simultaneously, although its unnaturalness is discussed at the same time.
In terms of the inference of the parameter vector, the use of robust variance, which can handle intrasubject correlations, is considered to be desirable for all models described above (AG, PWP, WLW, and LWA models). Regarding the parameter estimation of the PWP, WLW, and LWA models, each recurrence is assumed to have a common effect in this study.
The Poisson regression model fits the framework of the generalized linear models in which, the response variable, which is the number of occurrences of the event in a fixed time interval, follows Poisson distribution. Let μ(X) be the expected value of the number of relapses; N(X), the total observation period; λ(X), the constant relapse rate of MS; X, the covariate vector; and β, the parameter vector to be estimated. The relapse rate can then be written as
Thus, μ(X) = N(X)exp(X'β).
The relapse rate is not necessarily constant throughout the observation period; it is better to partition the time axis into intervals of constant rates.
Consequently, the intrasubject correlation of the relapse rate among the intervals can be discussed in terms of GEE. GEE is an extension of generalized linear models and regards a subject as a cluster so that the treatment effect can be estimated considering the correlation structure among response variables [21]. It is expected to be a flexible method for analyzing recurrent event data because it can be used even if many of the aforementioned assumptions regarding the proportional hazard models do not hold. In this study, the GEEPoisson model was applied, and intervals were set at 6 months, each with the common rate.
Simulation study
To determine which model is the most suitable for analyzing MS clinical trial data, we conducted the simulation study with consideration of the disease progression process or natural history. When performing a simulation study, we should examine the event generation process, which might be suited to the situation of the disease progression process [22]. The data generation process of this study was as follows. In a hypothetical randomized controlled clinical trial with placebo (n = 100) and active (n = 100) groups, we assumed that each patient had 10 hypothetical latent lesions in their brain and that the lesions were in the inactive phase at the entry time. The recurrence time was recorded after each lesion developed to the active phase. This setting modeled some MS pathological characteristics, such as time and spatial distribution of the lesions. The total followup period was set to 3 years, and the censoring time, which was assumed to be independent from the recurrence time, was generated using a Weibull distribution S(t) = exp (λt^{γ}) with the shape parameter γ = 2.1399 and the scale parameter λ = 0.000000576. The time to recurrence was also generated using a Weibull distribution, and 2 different scenarios were considered.
Scenario 1: All patients have individual identical Weibull distribution parameters, γ = 1.1452 and λ = 0.00141.
Scenario 2: Mixture population of 3 different sets of parameters; 46% of the population has γ = 1.2442 and λ = 0.000604, 45% of the population has γ = 1.1550 and λ = 0.001578, and 9% of the population has γ = 1.9694 and λ = 0.0000661.
The parameters used in our simulation study were calculated from other clinical trial data in Japan [23], especially for the placebo group, which can be regarded as the natural history cohort [24]. The mixture proportions described in Scenario 2 were obtained from the distribution of the number of recurrences in the year preceding the study, which was one of the important covariates. The hazard ratio (relapse rate ratio) of the active group to the placebo group was set at 1/1.3, such the true value of the Cox regression parameter (loghazard ratio) was log(1/1.3) = 0.26236. Each simulation was repeated 1000 times, and the results were evaluated via the bias and mean square error (MSE). The bias is the difference between the estimated and true (or reference) values; thus, the treatment effect would be underestimated if we obtained positive bias and, overestimated if we obtained negative bias. The MSE considers both bias and variability as gauged by the variance of parameter estimates.
After the simulation study, the models were applied to the IFN beta real clinical trial data separately. All statistical analyses, including the simulation study, were performed using SAS version 9.1.3 (SAS Institute Inc., Cary, NC, USA). The SAS sample programs for the use of these models are shown in the appendix of this article (Additional file 1), and a dummy data set is used to clarify the use of the models.
Additional file 1. Appendix: SAS programming codes (example). The SAS sample programs for the use of the regression models shown in this study are provided using a dummy data set.
Format: PDF Size: 171KB Download file
This file can be viewed with: Adobe Acrobat Reader
Results
The results of the simulation study are presented in Table 1. AG, Poisson, and GEEPoisson models showed similar results, with positive bias and relatively small MSE in both scenarios. Almost no bias was detected in PWPT, PWPG, and LWA models in Scenario 1, whereas they showed larger bias in Scenario 2. In the WLW model, a relatively large bias with a negative direction was noted, indicating that an overestimation of the treatment effect and a large MSE were detected in both scenarios.
Table 1. Bias and MSE from the simulation study
The various aforementioned models were then applied to the real clinical trial data introduced in the Methods section with adjustment for some important covariates, such as sex, age, EDSS score at entry time, total area of MS lesions on the MRI scan at the entry time, and number of recurrences in the year preceding the study. Table 2 shows the results of the analysis. The hazard ratio indicates the relative risk of the highdose group to the lowdose group, and it is distributed from 0.613 (WLW model) to 0.769 (timetofirstevent Cox model). All models except the timetofirstevent Cox model showed a significant effect of highdose IFN beta1b. The standard error of the WLW model was the largest, while the PWPT and PWPG models showed relatively small values. The width of the confidence intervals of the AG, PWPT, PWPG, WLW, LWA, Poisson, and GEEPoisson models was smaller than that of the timetofirstevent Cox model.
Table 2. Estimates of treatment effects for MS clinical trials in Japan
Regarding the behavior of the other covariates besides the IFN beta1b variable, "the number of recurrences in the year preceding the study" showed significant differences in all 8 models. As the number increased, the hazard of recurrence in the study increased (range of hazard ratio among the 8 models: 1.1641.375).
Discussion
Because MS is a heterogeneous disease with a variety of subtypes and transitional cases, it is not easy to evaluate drug efficacy. By conducting a simulation study and applying it to real clinical trial data, we examined various extended Cox regression models and a Poisson regression model using GEE, which can handle recurrent events  not only the number of recurrences or the time to the first event, but all recurrences that occurred during the followup period. With the use of the extended models, significant effects were detected and the importance of utilizing more than 1 recurrent time was suggested by our analyses.
From the simulation study results, treatment effect was relatively overestimated in the WLW model. The same tendency was observed in the analysis of the real data; the WLW model showed the smallest hazard ratio. This overestimation tendency might not be desirable, especially in confirmatory trials. The bias and MSE of the LWA model in Scenario 1 were small for homogeneous population because of the similarity of the assumption of the data generation process; however, in Scenario 2, both bias and MSE became larger to some extent for heterogeneous population. In terms of MSE of the PWPT and PWPG models, totally preferable results were obtained, but the bias differences between Scenarios 1 and 2 for each model were approximately 2 times larger than those of the other models; this finding suggests of possible unstable features in the PWP models. For the timetofirstevent Cox model, AG model, Poisson regression model, and GEEPoisson model, no extreme differences were found.
We are then left to select the best model for our data. All models have their own assumptions and characteristics, and so, our decision must consider the nature and system of disease progression that we have analyzed in advance in order to make the correct choice. When we consider the pathological condition in MS, such as the time and spatial distribution of latent lesions, the LWA model seems to be reasonable because of the assumption of a common baseline hazard, which means that each latent lesion has the same risk of development. If we can assume that all subjects have the same number of lesions that can develop at the same risk, the LWA model becomes conceivable. In the same way, if we can assume that all subjects have the same number of lesions that can develop at different risks, the WLW model seems to be best fitted. However, such settings would be unrealistic. In addition, the precision of the estimates in the WLW and LWA models is relatively poor. As the number of lesions increases, the number of strata also increases, which might lead to unstable estimates.
If we assume that the independent increments for all events are even among subjects, then the AG model is reasonable; however, this assumption would be unnatural in this case. PWP models would involve a similar situation despite small standard errors. In fact, the martingale residuals, which enabled us to examine the increment dependency, showed negative slopes throughout the period. This suggests that the assumptions that the AG and PWP models required would not exactly hold. The estimates of the Poisson regression and GEEPoisson models were quite similar, so the advantage of using the GEEPoisson model was not entirely clear in our study. However, if we had had a longer followup period and more time intervals, the method that accounts for the intrasubject correlation structure among intervals would be the more attractive model.
Realistically speaking, the assumptions needed in the extended Cox regression models (AG, PWP, WLW, and LWA models) would be difficult to be completely examined because of the uncertainty of MS pathological and/or clinical deterioration mechanisms and that fact that no one can prove the correctness of these assumptions. The Poisson regression and GEEPoisson models are free of such assumptions. Moreover, in terms of the advanced nature regarding consideration of intrasubject correlation for recurrences in the GEEPoisson model, the GEEPoisson model is preferred over the Poisson regression model. However, further study regarding the behavioral characterization among these models is still needed.
Conclusions
Our results indicate that the use of alternative models that include recurrence event data, especially the GEEPoisson model, may provide better analysis for estimating the treatment effect.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
MNM, TY and YO participated in the design of the study. MNM performed the statistical analysis and drafted the manuscript. All the authors read and approved the final manuscript.
Acknowledgements
The authors are grateful to the pharmaceutical company for providing the multiple sclerosis clinical trial data.
References

The IFNB Multiple Sclerosis Study Group: Interferon beta1b is effective in relapsingremitting multiple sclerosis. I. Clinical results of a multicenter, randomized, doubleblind, placebocontrolled trial.
Neurology 1993, 43:655661. PubMed Abstract

Saida T, Tashiro K, Itoyama Y, Sato T, Ohashi Y, Zhao Z: Interferon beta1b is effective in Japanese RRMS patients: a randomized, multicenter study.
Neurology 2005, 64:621630. PubMed Abstract  Publisher Full Text

Kappos L, Polman CH, Freedman MS, Edan G, Hartung HP, Miller DH, Montalban X, Bauer L, Jakobs P, Pohl C, Sandbrink R: Treatment with interferon beta1b delays conversion to clinically definite and McDonald MS in patients with clinically isolated syndromes.
Neurology 2006, 67:12421249. PubMed Abstract  Publisher Full Text

Carroll WM: Clinical trials of multiple sclerosis therapies: improvements to demonstrate longterm patient benefit.
Mult Scler 2009, 15:951958. PubMed Abstract  Publisher Full Text

The PRISMS Study Group: Randomised doubleblind placebocontrolled study of interferon beta1a in relapsing/remitting multiple sclerosis.
Lancet 1998, 352:14981504. PubMed Abstract  Publisher Full Text

Johnson KP, Brooks BR, Cohen JA, Ford CC, Goldstein J, Lisak RP, Myers LW, Panitch HS, Rose JW, Schiffer RB: Copolymer 1 reduces relapse rate and improves disability in relapsingremitting multiple sclerosis: results of a phase III multicenter, doubleblind, placebocontrolled trial.
Neurology 1995, 45:12681276. PubMed Abstract

Fazekas F, Deisenhammer F, StrasserFuchs S, Nahler G, Mamoli B: Randomised placebocontrolled trial of monthly intravenous immunoglobulin therapy in relapsingremitting multiple sclerosis.
Lancet 1997, 349:589593. PubMed Abstract  Publisher Full Text

Sorensen PS, Wanscher B, Jensen CV, Schreiber K, Blinkenberg M, Ravnborg M, Kirsmeier H, Larsen VA, Lee ML: Intravenous immunoglobulin G reduces MRI activity in relapsing multiple sclerosis.
Neurology 1998, 50:12731281. PubMed Abstract

The Lenercept Multiple Sclerosis Study Group and The University of British Columbia MS/MRI Analysis Group: TNF neutralization in MS: results of a randomized, placebocontrolled multicenter study.
Neurology 1999, 53:457465. PubMed Abstract  Publisher Full Text

Miller DH, Kahn OA, Sheremata WA, Blumhardt LD, Rive GP, Libonati MA, WillmerHulme AJ, Dalton CM, Miszkiel KA, O'Connor PW: A controlled trial of natalizumab for relapsing multiple sclerosis.
N Engl J Med 2003, 348:1523. PubMed Abstract  Publisher Full Text

Therneau TM, Grambsch PM: Modeling Survival Data: Extending the Cox Model. New York: SpringerVerlag; 2000.

Wang YC, Meyerson L, Tang YQ, Qian N: Statistical methods for the analysis of relapse data in MS clinical trials.
J Neurol Sci 2009, 285:206211. PubMed Abstract  Publisher Full Text

Cook RJ, Lawless JF: The Statistical Analysis of Recurrent Events. New York: Springer; 2007.

Kuramoto L, Sobolev BG, Donaldson MG: On reporting results from randomized controlled trials with recurrent events.
BMC Medical Research Methodology 2008, 8:35. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Klein JP, Moeschberger ML: Survival Analysis: Techniques for Censored and Truncated Data. 2nd edition. New York: Springer; 2003.

Andersen PK, Gill RD: Cox's regression model for counting processes: a large sample study.
Ann Stat 1982, 10:11001120. Publisher Full Text

Prentice RL, Williams BJ, Peterson AV: On the regression analysis of multivariate failure time data.
Biometrika 1981, 68:373379. Publisher Full Text

Wei LJ, Lin DY, Weissfeld L: Regression analysis of multivariate incomplete failure time data by modeling marginal distributions.
J Am Stat Assoc 1989, 84:10651073. Publisher Full Text

Lee EW, Wei LJ, Amato DA: Coxtype regression analysis for large numbers of small groups of correlated failure time observations.

Liang KY, Zeger SL: Longitudinal data analysis using generalized linear models.
Biometrika 1986, 73:1322. Publisher Full Text

Metcalfe C, Thompson SG: The importance of varying the event generation process in simulation studies of statistical methods for recurrent events.
Stat Med 2006, 25:165179. PubMed Abstract  Publisher Full Text

Saida T, Ohashi Y, Tashiro K, Itoyama Y, Sato T, Hamaguchi K, Nishitani H, Shibasaki H, Araki S, Research committee for treatment of multiple sclerosis (Chairman: Igata A): Treatment of multiple sclerosis with mizoribine: (1) clinical results of a doubleblind, placebocontrolled study.

Wingerchuk DM, Weinshenker BG: The natural history of multiple sclerosis: implications for trial design.
Curr Opin Neurol 1999, 12:345349. PubMed Abstract  Publisher Full Text
Prepublication history
The prepublication history for this paper can be accessed here: