Abstract
Background
An Individual Patient Data (IPD) metaanalysis is often considered the goldstandard for synthesising survival data from clinical trials. An IPD metaanalysis can be achieved by either a twostage or a onestage approach, depending on whether the trials are analysed separately or simultaneously. A range of onestage hierarchical Cox models have been previously proposed, but these are known to be computationally intensive and are not currently available in all standard statistical software. We describe an alternative approach using Poisson based Generalised Linear Models (GLMs).
Methods
We illustrate, through application and simulation, the Poisson approach both classically and in a Bayesian framework, in twostage and onestage approaches. We outline the benefits of our onestage approach through extension to modelling treatmentcovariate interactions and nonproportional hazards. Ten trials of hypertension treatment, with allcause death the outcome of interest, are used to apply and assess the approach.
Results
We show that the Poisson approach obtains almost identical estimates to the Cox model, is additionally computationally efficient and directly estimates the baseline hazard. Some downward bias is observed in classical estimates of the heterogeneity in the treatment effect, with improved performance from the Bayesian approach.
Conclusion
Our approach provides a highly flexible and computationally efficient framework, available in all standard statistical software, to the investigation of not only heterogeneity, but the presence of nonproportional hazards and treatment effect modifiers.
Background
Metaanalysis methods are used to integrate quantitative findings from a set of related research studies with the aim of providing more reliable and accurate estimates of a treatment effect [1]. Traditionally a metaanalysis requires aggregate data (AD), extracted from publications or received directly from study authors. Summary statistics (e.g. log hazard ratios) are then synthesised using a fixed or random effects metaanalysis [2], where random effects can account for between study heterogeneity in the treatment effect. Metaregression models [3] attempt to explain this excess heterogeneity with studylevel covariates. However, the use of AD to conduct a metaanalysis has inherent problems, for example, hazard ratios are not always explicitly given in publications, leading to the development of alternative techniques to extract appropriate summary statistics [4]. Despite this, even when using the methods of Parmar et al., it can still be difficult to extract hazard ratios, as shown by Riley et al. [5].
An approach often considered the goldstandard alternative to an AD metaanalysis is a metaanalysis of individual patient data (IPD), which utilises the raw data from each study. IPD metaanalyses have been shown to be most common when analyzing timetoevent data [6]. The benefits of conducting an IPD metaanalysis with timetoevent data include: followup time can be longer and more up to date, analyses can be standardised across studies, model assumptions can be checked e.g. proportional hazards, and confounders can be adjusted for. However, IPD can be difficult to obtain, and a variety of methods have been developed to undertake metaanalyses from the published literature of timetoevent data. An early proposal by Dear [7] showed how to jointly synthesise survival proportions reported at multiple times, by utilising their correlation and combining them in a multivariate metaanalysis using generalised least squares. Dear investigated only fixed effects, leading the extension of Arends et al. to incorporate random effects [8]. Techniques to extract summary statistics from published studies have also been demonstated [4] for the use in standard AD metaanalyses. Fiocco et al. recently used a Poisson correlated gammafrailty approach to combine survival curves under heterogeniety, allowing the investigation of both betweenstudy variance and within and betweenarm correlations [9]. A frailty approach has also been implemented by Feng et al. incorporating crossed random effects using penalized quasilikelihood under a Poisson likelihood [10]. Further extensions of AD metaanalyses include assessment of the proportional hazards assumption [9,11]
IPD metaanalyses of timetoevent data can use either a twostage or onestage approach. The most commonly used, the twostage, is achieved by first fitting individual survival models to each trial. The chosen estimates of effect are then combined in a standard metaanalysis framework, now equivalent to an AD metaanalysis. In a onestage IPD metaanalysis, patient data from all studies are analysed simultaneously within a hierarchical framework. This draws parallels with the analysis of IPD from multicentre clinical trials, accommodating clustering within treatment centres; however, in a multicentre trial the treatment effect is not often random, whereas in a metaanalysis it often is. This is because in a multicentre trial we can achieve greater consistency in inclusion/exclusion criteria and other aspects of trial protocol, indicating that a fixed treatment effect is likely to be more appropriate. Senn discusses these issues in more detail [12], but we emphasise that, although randomeffects models are rarely used to analyse multicentre trials, they could also adopt the methods we present here. Indeed, published trial analysis guidelines do state: "mixed models may be used to explore heterogeneity of the treatment effect. These models consider centre and treatmentbycentre effects to be random, and are especially relevant when the number of sites is large" [13]. A range of hierarchical survival models within the Cox framework have been developed [1417], which can effectively account for heterogeneity in treatment effect and baseline risk. However, these methods can have a high computational burden and/or rely on userwritten programs, not currently available in standard statistical software [16]. Furthermore, these models do not investigate the validity of the assumption of proportional hazards. These reasons serve as motivation to consider alternative approaches, such as the percentile ratio [18] as a target of inference in this setting, developed predominantly for when the proportional hazards assumption appears violated.
The aim of this paper is to explore the use of Poisson regression, and the generalised mixed model extensions, to incorporate random effects to perform one and twostage IPD metaanalyses of timetoevent outcomes, as an alternative to hierarchical Cox models, and to extend the models to incorporate nonproportional hazards and treatmenteffect modifiers.
Methods
The Poisson approach to survival analysis
Poisson regression is used in the modelling of count data and contingency tables; however, the extension to modelling survival data via a piecewise exponential model [19] serves as an alternative approach to the widely used Cox model. It has been shown how the Cox model can be fitted using a Poisson GLM due to the shared form of the contribution to the partial loglikelihood, by splitting followup time into as many intervals as there are events [20]. However, this method can be computationally intensive. Alternatively, we can choose a smaller number of time intervals with fixed length, where patients are at risk of experiencing events within these [21], to closely approximate the Cox model. We also obtain direct estimates of the baseline hazard rate. Fine splitting of the timescale has been used to allow the use of splines and fractional polynomials to model the baseline hazard continuously [21,22].
A standard approach when choosing interval lengths is to use yearly splits [23]. The narrower the time interval, the more computationally intensive these methods will be; however, methods to compensate for this are available and described below. The shape of the underlying hazard function plays a crucial role in choosing the number of intervals necessary to successfully capture its variation. In this paper, quarter year, half year and yearly splits are compared.
Undertaking a onestage IPD metaanalysis within a Poisson framework is beneficial due to the highly developed area of GLMs. Random effects GLMs are available within all commonly used statistical software packages (e.g. Stata, SAS and R), allowing models to be applied without the need for specialist software.
Model fitting in a single trial
Consider the analysis of timetoevent data from a single trial, investigating the effect of a treatment. For the i^{th }patient, let x_{i }denote treatment group, coded 0/1 to denote control/treatment. A standard Cox proportional hazards model can be applied (and estimated by maximising the partial likelihood [24]):
where h_{0}(t) is the unspecified baseline hazard rate and β_{1 }the log hazard ratio (i.e. the treatment effect) for the treatment group compared to the control group. By splitting followup time into k = 1,...,K intervals and assuming a constant hazard within each interval we can apply the Poisson model:
where d_{ik }is the event indicator, taking the value of 0 or 1 (censored or event), representing a Poisson process for each patient during each time interval. Note that d_{ik }will not follow a Poisson distribution per se, but by doing so we recover the correct form of the likelihood for a piecewise exponential model. β_{1 }is once again the log hazard ratio for the treatment group compared to the control group. λ_{k }is the baseline hazard rate during the k^{th }time interval. Time at risk, y_{ik}, is included as a log offset in the linear predictor. If we split followup time at each unique event time and apply the Poisson model, we would obtain an identical estimate of the treatment effect, β_{1}, to that from a Cox model.
Twostage IPD metaanalyses models for survival data
The twostage approach can be thought of as more traditional, with individual survival models applied to each trial, and appropriate summary statistics extracted to allow AD metaanalysis techniques to be applied.
We extract from the j^{th }trial: the log hazard ratio for the treatment group compared to the control group, , and its variance , using either Cox or Poisson models, which can then be combined in a standard AD metaanalysis. Such AD metaanalysis models include a fixed effect model, where we assume all trials are estimating the same true treatment effect, applied for example using the inverse variance weighted method [1]. Alternatively, a random effect model can be applied where we assume that each estimate of the treatment effect comes from a distribution of treatment effects, with mean β_{1 }and variance τ^{2}. Following a random effect metaanalysis, a prediction interval can be calculated for the predicted treatment effect in an individual study, to help show the potential impact of betweentrial heterogeneity [25,26].
Onestage IPD metaanalyses models for survival data
We now describe onestage IPD metaanalyses models using the framework of proportional hazards models. The following models, if fitted using the Cox proportional hazards model, correspond to those developed by TudurSmith et. al. [14], which are estimated by maximising the penalized partial likelihood to find the best linear unbiased predictors, from which the REML estimators of the variance components were found [27].
Model A: Fixed treatment effect with proportional trial effects
For the i^{th }patient, i = 1,...,n_{j}, in the j^{th }trial, j = 1,...,J, the hazard function at time t can be written as:
where h_{0}(t) is the baseline hazard function in the reference trial (say j = 1, so β_{01 }constrained to be zero). β_{0j }is the proportional effect on the baseline hazard function due to the j^{th }trial, now j = 2,..., J. x_{ij }is coded 0.5/0.5 to denote control/treatment group and β_{1 }is the log hazard ratio for the treatment group compared to the control group, assumed equal across all trials. Model A makes the restrictive assumption that the hazard functions in all trials are proportional to a common baseline function.
The treatment group coding of 0.5/0.5 is used in all onestage models presented in this paper. Using this coding of the treatment group indicator, we assume equal variability in the log hazard rate across trials for both treatment groups. If we chose the 0/1 coding, this imposes the restrictive assumption that the variability in the log hazard rate of the treatment group coded 0, is zero [14,28].
Model B: Fixed treatment effect with baseline hazard stratified by trial
In reality, the assumption that the hazard functions in all trials are proportional is likely to be inappropriate. By allowing separate baseline hazard functions for each trial we can relax this assumption, whilst still assuming proportional hazards between treatment groups within each trial. Allowing separate baseline hazards, we have:
where h_{0j}(t) is the baseline hazard function in the j^{th }trial. As in Model A, β_{1 }represents the log hazard ratio for the treatment group compared to the control group, assumed constant across trials. No allowance for between study variation in the treatment effect is made in Models A and B.
Model C: Random treatment effect with proportional trial effects
Models which allow for betweentrial heterogeneity in the treatment effect are now considered. The following formulations assume an underlying mean treatment effect, coming from a population of treatment effects. The hazard function for the i^{th }patient in the j^{th }trial can be written as:
where h_{0}(t) is the baseline hazard function in the reference trial (say j = 1, so β_{01 }constrained to be zero). β_{0j }is the proportional effect on the baseline hazard function due to the j^{th }trial, now j = 2,...,J. β_{1 }is now interpreted as the mean log hazard ratio for a population of treatment effects, with b_{1j }the deviation of the log hazard ratio in the j^{th }trial from the population mean. This assumes that the b_{1j}'s come from a Normal distribution with mean zero and variance τ^{2}. This formulation produces a measure of the betweentrial heterogeneity in the treatment effect, τ^{2}.
Model D: Random treatment effect with baseline hazard stratified by trial
Finally, separate baseline hazards are allowed, with a random treatment effect:
where h_{0j}(t) is interpreted as in Model B, with β_{1}, b_{ij }and τ^{2 }defined in Model C. Model D, as in Model B, assumes proportional hazards across treatment groups only within trials.
Models A to D, within a hierarchical Cox framework, were applied by TudurSmith et al. [14] to IPD data from 5 trials comparing 2 antiepileptic drugs with timetoevent outcome first seizure. A total of 1225 patients were analysed. To illustrate the computational burden of hierarchical Cox models, the application of Model C took 29 hours to achieve convergence, whilst the application of Model D took 53 minutes to achieve convergence.
The Poisson approach to onestage IPD metaanalysis models of survival data
We now introduce Poisson based GLM formulations of the models shown above. Techniques to increase the computational efficiency of the models are described in Section titled "Model fitting" below.
Onestage IPD Poisson generalised linear survival models
Models A and C: Fixed/random treatment effect with proportional trial effects. For time intervals, k = 1,...,K, we now have:
where λ_{k }represents the constant hazard rate in the k^{th }interval for the control group, in the reference trial.
Models B and D: Fixed/random treatment effect with baseline hazard stratified by trial. Models B and D are similarly altered. For trials, j = 1,...,J, and time intervals, k = 1,...,K, we can write the baseline hazard function as:
where λ_{jk }represents the constant hazard rate in the j^{th }trial during the k^{th }time interval.
Model fitting
We present Model A in the form of a Poisson GLM:
where d_{ijk }is the event indicator, taking the value of 0 or 1 (censored or event), representing a Poisson process for each patient in each trial during each time interval. β_{0j }and β_{1 }are as in Model A, with λ_{k }once again the hazard rate in the control group of the reference trial. Time at risk, y_{ijk}, is included as a log offset in the linear predictor. The extension to separate trial effects can be achieved by simply replacing the linear β_{0j }and λ_{k }terms with the interaction of them, i.e. Model B.
Fixed effect Models A and B can be implemented using any GLM software package, such as glm within Stata [29]. Models C and D, with random treatment effects, can be implemented using a multilevel mixed effects Poisson regression package, such as Stata's xtmepoisson.
It is widely known that within a mixed effects framework, maximum likelihood performs poorly when estimating variance parameters when there are a small number of studies [28]. This provides motivation for considering a Bayesian approach to the models discussed above, described and undertaken in the simulation study and results sections below.
If we have N independent Poisson distributed random variables, each with mean λ, then the sum of these N distributions is itself a Poisson distributed random variable with mean Nλ. Given this condition, it is possible to 'collapse' each split dataset across covariate patterns (for example, separately collapse the dataset for males and females) [30]. A Poisson GLM model can then be fitted to the collapsed dataset, giving identical parameter estimates to a Poisson GLM fitted to the noncollapsed dataset. This process dramatically reduces computation time when datasets are large; however, is only valid when categorical covariates are used. It is not possible to collapse across covariate patterns when including truly continuous covariates.
When handling sparse event data, the situation may arise when no events occur within a split time interval. In this case, when applying the models described in this section, we obtain nuisance estimates of the baseline hazard rate for any time interval in which no events occur. This can be remedied by the merging of time intervals.
Simulation study
To fully assess the performance of these methods a simulation study was devised. Data is simulated consisting of a random treatment effect and proportional trial effects. We investigate the impact of the number of studies and time interval length by simulating either 5, 10 or 30 trials, and applying Poisson onestage models with time intervals of length 0.25, 0.5 or 1 year. Each trial is simulated under the following steps:
1. Generate 2000 patients; 50% assigned to treatment, 50% to control.
2. Simulate a random treatment effect (on the log scale) with mean, α = 0.4, and inherent betweentrial heterogeneity, τ = 0.2. Therefore β_{1 }~ N(0.4,0.2^{2}), indicating a 33.0% (95% CI: 0.8%, 54.7%) reduction in the event rate due to treatment.
3. Generate a fixed trial effect, β_{0 }~ N(0,0.5^{2}), again on the log scale.
4. Generate survival times from a Weibull distribution using a formulation proposed by Bender et al. [31]. Scale and shape parameters were defined as λ = 0.042 and γ = 1.2, respectively. These values are based on fitting a Weibull survival model to the SHEP trial. All observations are censored after 5 years. This produces a 74.8% and 82.4% survival proportion after 5 years in the control and treatment groups, respectively.
This results in 9 scenarios, in which 1000 repetitions were simulated. For each simulated dataset, Model C was applied both classically using xtmepoisson within Stata, whilst WinBUGS, through the use of winbugsfromstata[32], was used to apply the equivalent Bayesian model. Each Bayesian model was applied with a burnin of 1000 and sample of 5000. This was deemed adequate to achieve convergence through extensive testing of the simulations. Vague priors were assigned to all parameters under the Bayesian approach. The treatment group indicator was coded 0.5/0.5.
Extensions to the onestage approach
Treatment effect modifiers
It is becoming increasingly accepted that variation in treatment effects, as a source of heterogeneity, can only be sufficiently detected and explained when IPD are available [33]. IPD allows one to examine covariates and withintrial interactions at the patientlevel. In contrast, metaregression of only AD allows one to examine studylevel covariates and interactions acrosstrials, and this has been shown to have low power to detect true interactions between patient covariates and treatment effect [34], and may also be subject to ecological bias and study level confounding [35].
The discrimination between withintrial and acrosstrial treatmentcovariate interactions is a current issue in IPD metaanalysis [35,36], which requires further work within the survival analysis field. Below we present a simple onestage model which produces a weighted average of the within and acrosstrial interactions, though in our applied example the withintrial interaction dominates.
Fixed treatment effect with separate trial effects
Let w_{ij }be a patientlevel covariate, e.g. overweight status (coded 0/1 for no/yes, see Table 1) for the i^{th }patient in the j^{th }trial. Extending Model B to incorporate a treatmentcovariate interaction gives:
Table 1. Summary statistics for the IPD metaanalysis investigating effectiveness of antihypertension drugs
where λ_{jk }is the baseline hazard rate during the k^{th }time interval in the j^{th }trial, β_{1 }now represents the treatment effect when w_{ij }= 0, μ is the change in the log hazard rate of the control group for a oneunit increase in w_{ij }and γ is the change in the treatment effect for a oneunit increase in w_{ij}.
Nonproportional hazards for the treatment effect
It has been shown that the benefits of a treatment can be deemed greater during the initial period of followup time in certain contexts [37]. In this situation, an assumption of proportional hazards for the treatment effect will be violated. In other words, a beneficial treatment effect may diminish with time. We describe a simple approach of investigating the presence of nonproportional hazards in the treatment effect, which can be extended to any covariate within the model.
Fixed treatment effect with separate trial effects
Extending Model B, we first dichotomise followup time at time t_{s}, and define a variable, z_{ijk}, which takes the value 0 if t < t_{s }or 1 if t ≥ t_{s}.
now represents the log hazard ratio for the treatment group compared to the control group when t < t_{s}, with the change in the log hazard ratio when t ≥ t_{s}, relative to when t < t_{s}. The estimated log hazard ratio for the treatment group compared to the control group when t ≥ t_{s }is therefore a linear combination; . The inclusion of nonproportional hazards can be investigated using the likelihood ratio test, comparing with Model B.
This can be extended by further splitting of followup time; however, the time variable, z_{ijk}, would generally be assumed to have fewer intervals than those used to model the baseline hazard rate. Extension to include a timevarying treatment effect in Models A, C and D is easily conducted.
The hypertension data
The example dataset used to illustrate the models in this paper comes from an IPD metaanalysis investigating the effects of antihypertension drugs in lowering systolic and diastolic blood pressure as determinants of cardiovascular outcomes [38]. Randomised controlled trials (RCTs) were selected on the availability of IPD and the comparison of an active treatment to a placebo or control. This resulted in the inclusion of 10 trials consisting of 28,581 patients. Metaanalysis is important to summarise the average treatment effect, and any heterogeneity in the treatment effect, across these different trials, and it enables a broader assessment of the effects of hypertension treatments than is possible in a single trial alone.
Summary statistics for the timetoevent outcome allcause death and an overweight covariate are presented in Table 1. Overweight status is a binary covariate, coded 0/1 for no/yes, dichotomising BodyMass Index (BMI) at 25 kg/m^{2}. Detailed summary statistics can be found in the original metaanalysis [38].
Results
Single trial application
Comparing approaches, we apply a proportional hazards model investigating the effect of the treatment. The SHEP trial is used as an example, with outcome allcause death. Estimated hazard ratios for the treatment effect are presented in Table 2. We observe complete agreement in estimates and 95% confidence intervals across models, showing a nonstatistically significant reduction of 8.7% (95% CI: 10.1%, 24.3%) in the hazard of death for patients in the antihypertension treatment group compared to those in the control group.
Table 2. Estimates of treatment effect in the SHEP trial
Twostage IPD metaanalyses models for survival data
We now apply twostage random effects metaanalyses models to the hypertension data. In the first step we compare the Cox and Poisson models to obtain the estimates of the treatment effect in each trial, and associated variance . The second step is then conducted using the random effects AD metaanalysis model of DerSimonian and Laird [2].
Table 3 shows the estimates of the pooled hazard ratio. All 4 models produce consistent estimates of the pooled treatment effect, showing a 12% (95% CI: 2.6%, 20.4%) reduction in the hazard of death for patients in the active antihypertension treatment group compared to those in the control. No evidence of heterogeneity was found , indicating in this case a fixed effect model would suffice and would yield identical estimates. Forest plots from the twostage metaanalyses using Cox models and Poisson models with 0.5 year splits are shown in Figures 1 and 2, respectively, illustrating the consistent estimates of the treatment effect at both the trial and metaanalysis level.
Table 3. Results from twostage random effects metaanalyses.
Figure 1. Twostage metaanalyses with outcome allcause death. Cox models are used in the first step.
Figure 2. Twostage metaanalyses with outcome allcause death. Poisson GLMs are used in the first step.
Onestage IPD metaanalyses models for survival data
We now apply each of the models described in the methods section "Onestage IPD metaanalyses models for survival data" to the hypertension data, using the Poisson method both classically and under a Bayesian approach. Further comparison of Models A(fixed treatment and fixed proportional trial effects) and B (fixed treatment and baseline stratified by trial) are made using Cox proportional hazards models, under a classical approach. Under Bayesian Models A, B, C and D all parameters are assigned a vague prior of N(0,1000^{2}), excluding the heterogeneity parameter in Models C and D, where τ ~ N(0,1) with τ > 0. A burnin of 1000 was used, with 100,000 samples and thinning at every 20th sample to remove autocorrelation.
Estimates of the treatment effect and 95% confidence/credible interval are seen in Table 4. Comparing estimates obtained under classical Cox formulations of Models A and B with equivalent Poisson models, we observe almost identical estimates of the treatment effect and 95% confidence intervals for each time interval length. For example, under all 4 classical onestage IPD metaanalysis models with fixed treatment effect and proportional trial effects, we observe a 12.3% (95% CI: 3.0%, 20.7%) reduction in the hazard of death for patients in the active antihypertension treatment group compared to those in the control group. Consistent estimates of the treatment effect are obtained across all 3 choices of time interval.
Table 4. Estimates of the treatment effect from applying Models A to D both classically and under a Bayesian approach
Each mixed effects model also produces an estimate of heterogeneity in the treatment effect, seen in Table 5. Stark contrasts in estimates of τ can be seen between classical and Bayesian approaches to both Models C and D. For example, under a classical onestage Poisson model (with time intervals of 1 year) with random treatment effect, stratified by trial, we obtain an estimate of heterogeneity of τ = 5.92E09 (95% CI: 0, .), compared to the equivalent Bayesian models estimate of τ = 0.081 (95% Cred. Int.: 0.004, 0.310). The classical model has estimated τ to be essentially zero, and consequently failed to construct a 95% confidence interval.
Table 5. Estimates of heterogeneity from applying Models C and D both classically and under a Bayesian approach
To illustrate the computational efficiency of the method, using interval lengths of 1 year; application of Models C and D to collapsed data under a classical approach took 4.6 seconds and 60 seconds, respectively, to achieve convergence. Under a Bayesian approach the equivalent models took 64 seconds and 63 seconds, respectively, to complete the sampling.
Example code to fit Model C both classically within Stata, and under a Bayesian approach in WinBUGS [39] can be found in the Appendix.
Simulation results
Results from the simulation study, detailing mean estimates and coverages of the treatment effect and heterogeneity can be found in Tables 6 and 7, respectively. From Table 6, the treatment effect estimates appear consistent across classical and Bayesian frameworks for each model. A scatter plot matrix can be seen in Figure 3, further illustrating agreement between classical and Bayesian estimates. Coverage improves as the number of trials increase; however, within the classical models coverage is much less informative due to the moderate downward biases seen in the estimates of heterogeneity in Table 7. There is clear evidence that, irrespective of the number of trials or interval length, the classical mixed effects models consistently underestimate the true underlying heterogeneity of τ = 0.2. Estimates from the Bayesian models are generally less biased. Figure 4 shows a scatter plot matrix comparing classical and Bayesian estimates of τ, illustrating the classical approach consistently producing lower estimates of τ, compared to the Bayesian approach.
Table 6. Results of simulation study.
Table 7. Results of simulation study.
Figure 3. Scatter plot matrix comparing classical and Bayesian estimates of treatment effect. True value, α = 0.4.
Figure 4. Scatter plot matrix comparing classical and Bayesian estimates of betweenstudy standard deviation. True value, τ = 0.2.
We also conducted the simulations described above using a treatment group coding of 0/1. The estimates of heterogeneity from the classical model had much larger downward bias. For example, when using 0.5 year intervals, estimates of τ for 5, 10 and 30 studies were 0.112, 0.138 and 0.165, respectively when using the 0/1 treatment coding, compared with 0.147, 0.176 and 0.193 seen in Table 7 for the 0.5/0.5 coding. Estimates under a Bayesian approach remained consistent with those seen in Table 7.
We extended the simulation study to include application of Model D (random treatment effect with baseline hazard stratified by trial) to data simulated as described above. Unfortunately, due to excessive computation time, it proved infeasible to conduct the simulation study on all 9 scenarios. For example, a single run of the scenario including 10 trials with 0.25 year splits takes approximately 32 minutes. However, the 5 trial scenarios were completed and showed entirely consistent results to those described above. The computational difficulties are exclusively due to the classical approach, as each Bayesian model takes only seconds to execute the required number of MCMC samples.
Onestage approach extensions
Treatment effect modifier
We apply Model (10), both classically and in a Bayesian framework, to the hypertension data to examine whether treatment effect is modified by being overweight (as defined by a BMI value ≥ 25). Note we dichotomise BMI to illustrate the methodology here, but in practice continuous variables are better analysed on their continuous scale. All parameters in the Bayesian approach use the vague prior N(0,1000^{2}). Results are shown in Table 8. We observe almost identical estimates across classical models for each of the parameters of interest. When a patient is not overweight, all classical models predict a treatment effect reducing the mortality rate by approximately 14.2% (95% CI: 0.1%, 26.4.4%, 21.6%) in the hazard of death for overweight patients in the active antihypertension treatment group compared to those in the control. Being overweight is estimated to produce a 27.4% (95% CI: 16.5%, 37.0%) reduction in the mortality rate, with treatment group held constant. The equivalent Bayesian models produce almost identical estimates of effect compared to the classical models. Using the approach of Riley et al. [36] we also separated withinstudy and betweenstudy interactions but it did not change these findings.
Table 8. Onestage IPD metaanalyses investigating the interaction between treatment and overweight status
Nonproportional hazards
We now apply Model (11) to the hypertension data, letting t_{s }= 1. Results are presented in Table 9. From the classical models, a statistically significant (at the 5% level) 34.3% (95% CI: 16.2%, 48.5%) reduction in the hazard of death for patients in the active antihypertension treatment group compared to those in the control is observed in the first year of followup. The treatment effect after the first year is calculated by exp (β_{1 }+ ϕ). This produces a nonsignificant reduction of 6.4% (95% CI: 4.5%, 16.2%) in the hazard of death for patients in the active antihypertension treatment group compared to those in the control, showing evidence of a diminishing treatment effect. Figure 5 illustrates this change by plotting the piecewise constant hazard rate in each treatment arm for the COOP trial. Extension to incorporate a random treatment effect is also possible.
Discussion
The importance of having IPD available has been established, allowing a full exploration of betweenstudy heterogeneity [34] and the verification of model assumptions. By obtaining IPD, computational issues may become apparent with the sheer size of patient data being analysed when incorporating random effects. This issue is clearly highlighted when using other large datasets within the hierarchical Cox framework [14]. However, it should be noted that a variety of techniques have been developed to investigate heterogeneity and nonproportional hazards, for example, when combining aggregate level data from published studies [4,79,11].
In this paper, our aim was to illustrate an effective alternative to hierarchical Cox models, minimising computational issues and providing further interpretational benefits. Through minimal splitting of followup time, reliable estimates of effect can be obtained. Choice of interval lengths will depend on the underlying shape of the hazard function; however, the hazard ratio may be insensitive to the baseline, as illustrated by consistent estimates of the treatment effect across the 3 choices of interval length used in this paper. By combining the Poisson approach with the collapsing technique described above, we can dramatically reduce computation time. When analysing data with rare events, such models may be further advantageous through the need of less intervals. Differential followup times between trials can also be accounted for through this approach. Our approach provides direct estimates of the baseline hazard rate which is clinically important. These estimates allow the calculation of risk differences, or number needed to treat [40]. However, a limitation of our approach is that the collapsing technique described cannot be used with truly continuous covariates, such as age measured in days.
Investigation of random treatment effect models showed a marked underestimation of heterogeneity under the classical approach. This may in fact be explained by the tendency of maximum likelihood to underestimate variance parameters [28]. Under the Bayesian approach we observed improved performance, with comparatively lower absolute biases; however, it must be noted that, given the nature of the MCMC algorithm, the Bayesian approach will always provide a positive estimate of between study heterogeneity. A recent simulation study emphasised the need for care when choosing noninformative priors on variance parameters [41], which has specific relevance when investigating heterogeneity in the treatment effect, as in Models C and D. An alternative estimation procedure, such as hlikelihood [42], could be investigated.
It must be noted that if purely interested in a pooled treatment effect, then there is no advantage in pursuing a onestage over a twostage approach; however, investigation of treatment effect modifiers and modelling assumptions should be conducted simultaneously, which can only be done effectively through a onestage approach. Although previous work has provided effective methods to investigate heterogeneity in the metaanalysis setting [1417], we feel our approach provides a highly simplistic alternative which can incorporate the investigation of nonproportional hazards in covariate effects, and that of treatmenteffect modifiers, both of which should be considered in any IPD metaanalysis.
In our analysis of the hypertension dataset, we observed a 27.4% (95% CI: 16.55, 37.0%) reduction in the mortality rate when a patient is overweight compared to a nonoverweight patient, with treatment group held constant. Although this is a surprising result, it is one that has been identified previously [43]. Previous work by one of the authors of this article has also observed this relationship between BMI and mortality; however, further identified that the true factor lowering risk is height, i.e. lower risk is seen for overweight patients because they tend to be taller [44].
The approach detailed in this paper has the further benefit of allowing adjustment for confounders to be implemented simply. This becomes important when analysing IPD from observational studies, where the need to adjust for confounders is often paramount [45].
The flexibility of the Poisson approach described may be extended through the use of splines to model not only the baseline hazard, but also any timedependent effects [21]. This would result in more plausible predictions, allowing a continuous function estimate of both.
Finally, we recognise that the IPD approach does not necessarily solve all the problems for metaanalysis [46]; in particular, IPD may not be available from all the studies requested. In this situation a sensitivity analysis may be needed to examine whether IPD metaanalysis conclusions remain robust when aggregate data from nonIPD studies are additionally included as far as possible [35].
Conclusion
For an IPD metaanalysis of survival data, our approach provides a highly flexible and computationally efficient framework. The methods are available in all standard statistical software, allowing the investigation of not only heterogeneity, but the presence of nonproportional hazards and treatment effect modifiers.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
PL and RR conceived the project. MC carried out the analyses and conducted the simulation study. MC drafted the paper which was later revised by PL and RR through substantial contributions to the contents of the paper. JS, JW and FG were involved in conception, design and acquisition of the hypertension data. All authors read and approved the final manuscript.
Appendix
A.1. Model C: Random treatment effect with proportional trial effects
Classical model within Stata:
. *load data
. use hyperdata, clear
. *stset the data
. stset fudy, failure(death = 1) id(idnr) exit(time 5)
. *create time intervals by splitting at every year
. stsplit sp, every(1)
. egen spgrp = group(sp)
. *generate offset
. qui gen y = _t_t0
. *collapse across covariate patterns
. collapse (min) start = _t0 (max) end = _t (count) n = _d (sum) y _d, by(spgrp treat trial)
. *fit mixed effects Poisson model with random treatment effect
. xtmepoisson _d i.treat i.trial ibn.spgrp, exposure(y) nocons irr  trial: treat, nocons
Bayesian model within WinBUGS:
model{
for (i in 1:N){
d[i] ~ dpois(mu[i]) #likelihood
log(mu[i]) <  alpha[trial[i]]*(treat[i]0.5) + beta[trial[i]] + gamma[spgrp[i]] + log(y[i])
}
beta [1] <  0
### Priors ###
for (s in 1:J){
alpha[s] ~ dnorm(a,tau)
}
a ~ dnorm(0,1.0E6)
tau <  1/var
var <  pow(sd,2)
sd ~ dnorm(0,1)I(0,)
#Trial id:
for (p in 2:J){
beta[p] ~ dnorm(0.0,1.0E6)
}
#Intervals:
for (q in 1:ints){
gamma[q] ~ dnorm(0.0,1.0E6)
}
### Hazard ratio due to the treatment effect:
expalpha <  exp(a)
}
Acknowledgements
The authors would like to thank Lutgarde Thijs for her helpful comments and contribution to the management of the hypertension dataset. Michael Crowther is funded by a National Institute of Health Research (NIHR) Methods Fellowship (RPPG040710314). Richard Riley is supported by the MRC Midlands Hub for Trials Methodology Research, at the University of Birmingham (Medical Research Council Grant ID G0800808). We thank the referees for their comments, which have greatly improved the paper.
References

Sutton AJ, Abrams KR, Jones DR, Sheldon TA, Song F: Methods for MetaAnalysis in Medical Research. London: John Wiley; 2000.

DerSimonian R, Laird N: Metaanalysis in clinical trials.
Control Clin Trials 1986, 7:177188. PubMed Abstract  Publisher Full Text

Thompson SG, Higgins JPT: How should metaregression analyses be undertaken and interpreted?
Stat Med 2002, 21(11):15591573. PubMed Abstract  Publisher Full Text

Parmar MK, Torri V, Stewart L: Extracting summary statistics to perform metaanalyses of the published literature for survival endpoints.
Stat Med 1998, 17(24):28152834. PubMed Abstract  Publisher Full Text

Riley RD, Abrams KR, Sutton AJ, Lambert PC, Jones DR, Heney D, Burchill SA: Reporting of prognostic markers: current problems and development of guidelines for evidencebased practice in the future.
Br J Cancer 2003, 88(8):11911198. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Simmonds MC, Higgins JPT, Stewart LA, Tierney JF, Clarke MJ, Thompson SG: Metaanalysis of individual patient data from randomized trials: a review of methods used in practice.
Clin Trials 2005, 2(3):209217. PubMed Abstract  Publisher Full Text

Dear KB: Iterative generalized least squares for metaanalysis of survival data at multiple times.
Biometrics 1994, 50(4):9891002. PubMed Abstract  Publisher Full Text

Arends LR, Hunink MGM, Stijnen T: Metaanalysis of summary survival curve data.
Stat Med 2008, 27(22):43814396. PubMed Abstract  Publisher Full Text

Fiocco M, Putter H, van Houwelingen JC: Metaanalysis of pairs of survival curves under heterogeneity: a Poisson correlated gammafrailty approach.
Stat Med 2009, 28(30):37823797. PubMed Abstract  Publisher Full Text

Feng S, Wolfe RA, Port FK: Frailty Survival Model Analysis of the National Deceased Donor Kidney Transplant Dataset Using Poisson Variance Structures. [http://www.jstor.org/stable/27590610] webcite
J Am Stat Assoc 2005, 100(471):728735. Publisher Full Text

Williamson PR, Smith CT, Hutton JL, Marson AG: Aggregate data metaanalysis with timetoevent outcomes.
Stat Med 2002, 21(22):33373351. PubMed Abstract  Publisher Full Text

Senn S: The Many Modes of Meta. [http://dij.sagepub.com/content/34/2/535.abstract] webcite
Drug Inf J 2000, 34(2):535549. Publisher Full Text

International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. Statistical principles for clinical trials
Stat Med 1999, 18:19051942. PubMed Abstract

Legrand C, Ducrocq V, Janssen P, Sylvester R, Duchateau L: A Bayesian approach to jointly estimate centre and treatment by centre heterogeneity in a proportional hazards model.
Stat Med 2005, 24(24):37893804. PubMed Abstract  Publisher Full Text

Siannis F, Barrett JK, Farewell VT, Tierney JF: Onestage parametric metaanalysis of timetoevent outcomes.
Stat Med 2010, 29(29):30303045. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Holford TR: Life Tables with Concomitant Information.
Biometrics 1976, 32(3):587597. PubMed Abstract  Publisher Full Text

Whitehead J: Fitting Cox's regression model to survival data using GLIM.
Appl Stat 1980, 29:268275. Publisher Full Text

Carstensen B: Who needs the Cox model anyway? In Tech rep. Steno Diabetes Center, Denmark; 2004.

Lambert PC, Smith LK, Jones DR, Botha JL: Additive and multiplicative covariate regression models for relative survival incorporating fractional polynomials for timedependent effects.
Stat Med 2005, 24(24):38713885. PubMed Abstract  Publisher Full Text

Clayton D, Hills M: Statistical Methods in Epidemiology. Oxford University Press; 1993.

Higgins JPT, Thompson SG, Spiegelhalter DJ: A reevaluation of randomeffects metaanalysis.
J R Stat Soc A Stat Soc 2009, 172:137159. Publisher Full Text

Riley RD, Higgins JPT, Deeks JJ: Interpretation of random effects metaanalyses.
British Medical Journal 2011, 342:d549. PubMed Abstract  Publisher Full Text

TudurSmith C, Williamson PR, Marson AG: Investigating heterogeneity in an individual patient data metaanalysis of time to event outcomes.
Stat Med 2005, 24(9):13071319. PubMed Abstract  Publisher Full Text

Yamaguchi T, Ohashi Y: Investigating centre effects in a multicentre clinical trial of superficial bladder cancer.
Stat Med 1999, 18(15):19611971. PubMed Abstract  Publisher Full Text

Turner RM, Omar RZ, Yang M, Goldstein H, Thompson SG: A multilevel model framework for metaanalysis of clinical trials with binary outcomes.
Stat Med 2000, 19(24):34173432. PubMed Abstract  Publisher Full Text

StataCorp: Statistical Software: Release 11.0.
2001.

Royston P, Lambert PC: Flexible Parametric Survival Analysis Using Stata: Beyond the Cox Model. Stata Press; 2011.

Bender R, Augustin T, Blettner M: Generating survival times to simulate Cox proportional hazards models.
Stat Med 2005, 24(11):17131723. PubMed Abstract  Publisher Full Text

Thompson J, Palmer T, Morena S: Bayesian analysis in Stata using winBUGS.

Berlin JA, Santanna J, Schmid CH, Szczech LA, Feldman HI: Individual patient versus grouplevel data metaregressions for the investigation of treatment effect modifiers: ecological bias rears its ugly head.
Stat Med 2002, 21(3):371387. PubMed Abstract  Publisher Full Text

Lambert PC, Sutton AJ, Abrams KR, Jones DR: A comparison of summary patientlevel covariates in metaregression with individual patient data metaanalysis.
J Clin Epidemiol 2002, 55:8694. PubMed Abstract  Publisher Full Text

Riley RD, Lambert PC, Staessen JA, Wang J, Gueyffier F, Thijs L, Boutitie F: Metaanalysis of continuous outcomes combining individual patient data and aggregate data.
Stat Med 2008, 27(11):18701893. PubMed Abstract  Publisher Full Text

Riley RD, Steyerberg EW: Metaanalysis of a binary outcome using individual participant data and aggregate data.
Res Synth Methods 2010, 1:219. Publisher Full Text

Gore SM, Pocock SJ, Kerr GR: Regression Models and NonProportional Hazards in the Analysis of Breast Cancer Survival.

Wang J, Staessen JA, Franklin SS, Fagard R, Gueyffier F: Systolic and diastolic blood pressure lowering as determinants of cardiovascular outcome.
Hypertension 2005, 45(5):907913. PubMed Abstract  Publisher Full Text

Lunn DJ, Thomas A, Best N, Spiegelhalter D: WinBUGS  a Bayesian modelling framework: concepts, structure, and extensibility.
Stat Comput 2000, 10:325337. Publisher Full Text

Altman DG, Andersen PK: Calculating the number needed to treat for trials where the outcome is time to an event.
BMJ 1999, 319(7223):14921495. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Lambert PC, Sutton AJ, Burton PR, Abrams KR, Jones DR: How vague is vague? A simulation study of the impact of the use of vague prior distributions in MCMC using WinBUGS.
Stat Med 2005, 24(15):24012428. PubMed Abstract  Publisher Full Text

Lee Y, Nelder JA: Hierarchical Generalized Linear Models. [http://www.jstor.org/stable/2346105] webcite

Michiels S, Baujat B, MahÃ© C, Sargent DJ, Pignon JP: Random effects survival models gave a better understanding of heterogeneity in individual patient data metaanalyses.
J Clin Epidemiol 2005, 58(3):238245. PubMed Abstract  Publisher Full Text

Rondeau V, Michiels S, Liquet B, Pignon JP: Investigating trial and treatment heterogeneity in an individual patient data metaanalysis of survival data by means of the penalized maximum likelihood approach.
Stat Med 2008, 27(11):18941910. PubMed Abstract  Publisher Full Text

Schmidt D, Salahudeen A: The obesitysurvival paradox in hemodialysis patients: why do overweight hemodialysis patients live longer?
Nutr Clin Pract 2007, 22:1115. PubMed Abstract  Publisher Full Text

Gueyffier F, Boissel JP, Pocock S, Boutitie F, Coope J, Cutler J, Ekbom T, Fagard R, Friedman L, Kerlikowske K, Perry M, Prineas R, Schron E: Identification of risk factors in hypertensive patients: contribution of randomized controlled trials through an individual patient database.
Circulation 1999, 100(18):e88e94. PubMed Abstract  Publisher Full Text

Thompson S, Kaptoge S, White I, Wood A, Perry P, Danesh J, Collaboration TERF: Statistical methods for the timetoevent analysis of individual participant data from multiple epidemiological studies. [http://ije.oxfordjournals.org/content/early/2010/05/03/ije.dyq063.abstract] webcite
Int J Epidemiol 2010, 39:13451359. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Ahmed I, Sutton AJ, Riley RD: Assessment of publication bias, selection bias, and unavailable data in metaanalyses using individual participant data: a database survey.
BMJ 2012, 344:d7762. PubMed Abstract  Publisher Full Text
Prepublication history
The prepublication history for this paper can be accessed here: