Centre for Statistics in Medicine, University of Oxford, Oxford, UK

Warwick Clinical Trials Unit, University of Warwick, Coventry, UK

Department of Primary Care & General Practice, University of Birmingham, Birmingham, UK

Hub for Trials Methodology Research and UCL, MRC Clinical Trials Unit, London, UK

Abstract

Background

Multiple imputation (MI) provides an effective approach to handle missing covariate data within prognostic modelling studies, as it can properly account for the missing data uncertainty. The multiply imputed datasets are each analysed using standard prognostic modelling techniques to obtain the estimates of interest. The estimates from each imputed dataset are then combined into one overall estimate and variance, incorporating both the within and between imputation variability. Rubin's rules for combining these multiply imputed estimates are based on asymptotic theory. The resulting combined estimates may be more accurate if the posterior distribution of the population parameter of interest is better approximated by the normal distribution. However, the normality assumption may not be appropriate for all the parameters of interest when analysing prognostic modelling studies, such as predicted survival probabilities and model performance measures.

Methods

Guidelines for combining the estimates of interest when analysing prognostic modelling studies are provided. A literature review is performed to identify current practice for combining such estimates in prognostic modelling studies.

Results

Methods for combining all reported estimates after MI were not well reported in the current literature. Rubin's rules without applying any transformations were the standard approach used, when any method was stated.

Conclusion

The proposed simple guidelines for combining estimates after MI may lead to a wider and more appropriate use of MI in future prognostic modelling studies.

Background

Prognostic models play an important role in the clinical decision-making process, as they help clinicians to determine the most appropriate management of patients. A good prognostic model can provide an insight into the relationship between the outcome of patients and known patient and disease characteristics.

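Missing covariate data and censored outcomes are unfortunately common occurrences in prognostic modelling studies. With MI, each of the $m$ imputed datasets is analysed using standard techniques to give the estimates $\hat{Q}_1, \ldots, \hat{Q}_m$. Provided that the imputation procedure is proper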

Example techniques and parameters of interest in prognostic modelling studies and the rules currently available for combining estimates after MI are summarised. This paper will then provide guidelines on how estimates of the parameters of interest in prognostic modelling studies can be combined after performing MI. A review of the current practice for combining estimates after MI within published prognostic modelling studies is provided.

Methods

Prognostic models

Prognostic models, focusing on time to event data that may be censored, are often constructed using survival analysis techniques such as the Cox proportional hazards model or parametric survival models. Ideally, the covariates are pre-specified before the modelling process, because fitting the full model results in more reliable and less biased prognostic models than data-derived models based on statistical significance testing.

The parameters of interest in prognostic modelling are summarised in Table 1.

Table 1: Parameters of interest in prognostic modelling studies and ways to combine estimates after MI

| **Parameters** | **Possible methods for combining estimates of parameters after MI*** |
|---|---|
| **Covariate distribution** | |
| Mean value | Rubin's rules |
| Standard deviation | Rubin's rules |
| Correlation | Rubin's rules after Fisher's Z transformation |
| **Model parameters** | |
| Regression coefficient | Rubin's rules |
| Hazard ratio | Rubin's rules after logarithmic transformation |
| Prognostic index/linear predictor per patient | Rubin's rules |
| **Model fit and performance** | |
| Testing significance of individual covariate in model | Rubin's rules using a Wald test for a single estimate (Table 2(A)) |
| Testing significance of all fitted covariates in model | Rubin's rules using a Wald test for multivariate estimates (Table 2(B)) |
| Likelihood ratio $\chi^2$ test statistic | Rules for combining likelihood ratio statistics if parametric model (Table 2(D)) or $\chi^2$ statistics if Cox model (Table 2(C)) |
| Proportion of variance explained (e.g. $R^2$ statistics) | Robust methods |
| Discrimination (c-index) | Robust methods |
| Prognostic separation D statistic | Rubin's rules |
| Calibration (shrinkage estimate) | Robust methods |
| **Prediction** | |
| Survival probabilities | Rubin's rules after complementary log-log transformation |
| Percentiles of a survival distribution | Rubin's rules after logarithmic transformation |

* Reflect the authors' experiences and current evidence.

The likelihood ratio chi-square ($\chi^2$) statistic tests the hypothesis of no difference between the null model given a specified distribution and the fitted prognostic model.

Rules for MI inference

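The rules developed by Rubin for combining single or multivariate parameter estimates from the imputed datasets are described below, together with the extensions for combining $\chi^2$ statistics and likelihood ratio $\chi^2$ statistics.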

Combining parameter estimates

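For a single population parameter of interest, $Q$, the overall MI estimate is the average of the $m$ estimates from the imputed datasets, $\bar{Q} = \frac{1}{m}\sum_{j=1}^{m}\hat{Q}_j$. Its total variance is $T = \bar{U} + (1 + m^{-1})B$, where $\bar{U} = \frac{1}{m}\sum_{j=1}^{m}U_j$ is the within-imputation variance and $B = \frac{1}{m-1}\sum_{j=1}^{m}(\hat{Q}_j - \bar{Q})^2$ is the between-imputation variance.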

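These procedures for combining a single quantity of interest can be extended in matrix form to combine $k$ estimates, where $\mathbf{Q}$ is a $k \times 1$ vector of the estimates and $\mathbf{U}$ is the associated $k \times k$ covariance matrix.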
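As an illustration, Rubin's rules for a single estimate take the average of the $m$ estimates as the point estimate, and add the average within-imputation variance to $(1 + 1/m)$ times the between-imputation variance. The sketch below is a minimal version using only the Python standard library; the function name `pool_rubin` and the input values are hypothetical and are not taken from the studies reviewed here.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m point estimates and their variances using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    q_bar = mean(estimates)          # overall MI point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b      # total variance
    return q_bar, t, u_bar, b


# Hypothetical log hazard ratios and squared standard errors from m = 5 imputed datasets
log_hr = [0.50, 0.55, 0.45, 0.60, 0.40]
se_sq = [0.040, 0.042, 0.038, 0.041, 0.039]

q_bar, t, u_bar, b = pool_rubin(log_hr, se_sq)
print(f"pooled estimate = {q_bar:.3f}, pooled SE = {t ** 0.5:.3f}")
```

Note that the inputs are variances (squared standard errors), not standard errors.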

Hypothesis testing

Significance level based on a single combined estimate

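A significance level for testing the null hypothesis that a single combined estimate equals a specific value $Q_0$, $H_0: Q = Q_0$, can be obtained using a Wald test by comparing the test statistic $W = (Q_0 - \bar{Q})^2 / T$ against an $F$ distribution with one and $v = (m-1)(1 + r^{-1})^2$ degrees of freedom, where $r = (1 + m^{-1})B/\bar{U}$ is the relative increase in variance due to the missing data (Table 2(A)). Here $\bar{Q}$ is the combined estimate, $T$ its total variance, $B$ the between-imputation variance and $\bar{U}$ the within-imputation variance.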
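The scalar Wald test can be sketched as follows. This is a minimal illustration that returns the test statistic, the denominator degrees of freedom and the relative increase in variance; the function name `wald_test_scalar` and the data are hypothetical, and the p-value lookup is left to the reader's statistical software.

```python
from statistics import mean, variance


def wald_test_scalar(estimates, variances, q0=0.0):
    """Wald test of H0: Q = q0 for a single estimate combined by Rubin's rules.

    Returns (w, v, r); w is referred to an F distribution on 1 and v df.
    """
    m = len(estimates)
    q_bar = mean(estimates)
    u_bar = mean(variances)
    b = variance(estimates)
    if b == 0:
        raise ValueError("no between-imputation variance: v is infinite")
    t = u_bar + (1 + 1 / m) * b          # total variance
    r = (1 + 1 / m) * b / u_bar          # relative increase in variance
    w = (q0 - q_bar) ** 2 / t            # Wald statistic
    v = (m - 1) * (1 + 1 / r) ** 2       # denominator degrees of freedom
    return w, v, r


# Hypothetical log hazard ratios and squared standard errors, m = 5
log_hr = [0.50, 0.55, 0.45, 0.60, 0.40]
se_sq = [0.040, 0.042, 0.038, 0.041, 0.039]

w, v, r = wald_test_scalar(log_hr, se_sq, q0=0.0)
print(f"W = {w:.3f} on (1, {v:.1f}) df, r = {r:.4f}")
```

If SciPy is installed, `scipy.stats.f.sf(w, 1, v)` gives the corresponding upper-tail probability.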

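Table 2: Summary of significance tests for combining different estimates from $m$ imputed datasets

| **Estimate** | **Test statistic** | **Degrees of freedom (df)** | **Relative increase in variance (r)** |
|---|---|---|---|
| **A) Scalar $Q$** | $W = (Q_0 - \bar{Q})^2 / T \sim F_{1,v}$, testing $H_0: Q = Q_0$ | $v = (m-1)(1 + r^{-1})^2$ | $r = (1 + m^{-1})B/\bar{U}$ |
| **B) Multivariate $\mathbf{Q}$** ($k$ parameters) | $W = (\mathbf{Q}_0 - \bar{\mathbf{Q}})^{T}\bar{\mathbf{U}}^{-1}(\mathbf{Q}_0 - \bar{\mathbf{Q}}) / \{k(1 + r_1)\} \sim F_{k,v}$, testing $H_0: \mathbf{Q} = \mathbf{Q}_0$ | $v = 4 + (t-4)[1 + (1 - 2t^{-1})/r_1]^2$ if $t = k(m-1) > 4$; otherwise $v = t(1 + k^{-1})(1 + r_1^{-1})^2/2$ | $r_1 = (1 + m^{-1})\,\mathrm{tr}(\mathbf{B}\bar{\mathbf{U}}^{-1})/k$ |
| **C) $\chi^2$ statistics** $w_1, \ldots, w_m$ from $\chi^2$ tests on $k$ df | $W = \{\bar{w}/k - (m+1)(m-1)^{-1} r_2\}/(1 + r_2) \sim F_{k,v}$ | $v = k^{-3/m}(m-1)(1 + r_2^{-1})^2$ | $r_2 = (1 + m^{-1})\left[\frac{1}{m-1}\sum_{j=1}^{m}\left(\sqrt{w_j} - \overline{\sqrt{w}}\right)^2\right]$ |
| **D) Likelihood ratio $\chi^2$ statistics** $w_{L1}, \ldots, w_{Lm}$ | $W = \tilde{w}_L / \{k(1 + r_L)\} \sim F_{k,v}$ | As in (B), with $r_L$ in place of $r_1$ | $r_L = \frac{m+1}{k(m-1)}(\bar{w}_L - \tilde{w}_L)$ |

**KEY**:

$\bar{Q}$, $T$, $\bar{U}$ and $B$ are the combined estimate, total variance, within-imputation variance and between-imputation variance; bold symbols denote their vector and matrix counterparts for $k$ parameters.

$w_j$, $j = 1, \ldots, m$, are the $\chi^2$ statistics associated with testing the null hypothesis $H_0: Q = Q_0$ on each imputed dataset, such that the significance level for the $j$th imputed dataset is $P\{\chi^2_k > w_j\}$, where $\chi^2_k$ is a $\chi^2$ value with $k$ degrees of freedom.

$\bar{w}$ is the average of the $m$ $\chi^2$ statistics and $\overline{\sqrt{w}}$ is the average of their square roots.

$\bar{w}_L$ is the average of the $m$ likelihood ratio statistics $w_{L1}, \ldots, w_{Lm}$, and $\tilde{w}_L$ is the average of the $m$ likelihood ratio statistics evaluated using the average MI parameter estimates and the average of the estimates from a model fitted subject to the null hypothesis.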

Significance level based on combined multivariate estimates

In the context of prognostic modelling, it is useful to test the global null hypothesis that all $k$ regression coefficients are zero. The Wald test of $H_0: \mathbf{Q} = \mathbf{Q}_0$ is provided in Table 2(B).

Significance level based on combining $\chi^2$ statistics

An alternative to testing the multivariate point estimates is the method for combining $\chi^2$ statistics associated with testing a null hypothesis $H_0: Q = Q_0$, e.g. that a regression coefficient is zero or that all regression coefficients are zero (Table 2(C)). It can be applied when only the $\chi^2$ statistics are available. This approach is deficient compared to the method for combining multivariate estimates and should be used only as a guide, especially when there are a large number of parameters compared to only a small number of imputations. The significance level is not fully determined by the $\chi^2$ statistics alone, and thus there is a consequent loss of power. The relative increase in variance is estimated by $r_2$ in Table 2(C), where each $\chi^2$ statistic is based on $k$ degrees of freedom.
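The combination of $\chi^2$ statistics can be sketched as below. The relative increase in variance is estimated from the sample variance of the square roots of the $m$ statistics, and the combined statistic is referred to an $F$ distribution on $k$ and $v$ degrees of freedom. The function name `pool_chi2` and the input values are hypothetical; the values were chosen only to keep the arithmetic easy to follow.

```python
from math import sqrt
from statistics import mean, variance


def pool_chi2(chi2_stats, k):
    """Combine m chi-square statistics, each on k degrees of freedom.

    Returns (stat, df1, df2, r2); stat is referred to F on df1 and df2 df.
    """
    m = len(chi2_stats)
    if m < 2:
        raise ValueError("need at least two imputations")
    w_bar = mean(chi2_stats)
    # relative increase in variance, from the variance of the square roots
    r2 = (1 + 1 / m) * variance([sqrt(w) for w in chi2_stats])
    if r2 == 0:
        raise ValueError("identical statistics: df2 is infinite")
    stat = (w_bar / k - (m + 1) / (m - 1) * r2) / (1 + r2)
    df2 = k ** (-3 / m) * (m - 1) * (1 + 1 / r2) ** 2
    return stat, k, df2, r2


# Hypothetical chi-square statistics from m = 3 imputed datasets, each on k = 2 df
stat, df1, df2, r2 = pool_chi2([4.0, 9.0, 16.0], k=2)
print(f"F = {stat:.4f} on ({df1}, {df2:.4f}) df, r2 = {r2:.4f}")
```

Only the $m$ statistics and $k$ are needed, which is why this method remains usable when covariance matrices are not reported.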

Significance level based on combining likelihood ratio $\chi^2$ statistics

The method for combining the likelihood ratio $\chi^2$ statistics (Table 2(D)) is a further alternative to the method for combining $\chi^2$ statistics. The obtained significance level should be asymptotically equivalent to that based on the combined multivariate estimates.

The likelihood function needs to be fully specified in order to calculate the likelihood ratio statistics determined at the average of the parameter estimates over the $m$ imputed datasets.
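A minimal sketch of the combination step for likelihood ratio statistics follows. It assumes the two sets of likelihood ratio statistics have already been computed: one set evaluated at each imputed dataset's own estimates, and one set evaluated at the averaged estimates. The function name `pool_likelihood_ratio` and the numbers are hypothetical.

```python
from statistics import mean


def pool_likelihood_ratio(lr_own, lr_at_pooled, k):
    """Combine likelihood ratio chi-square statistics from m imputed datasets.

    lr_own       : statistics evaluated at each dataset's own estimates
    lr_at_pooled : statistics re-evaluated at the averaged estimates
    k            : number of parameters tested
    Returns (stat, df1, df2, r_l); stat is referred to F on df1 and df2 df.
    """
    m = len(lr_own)
    if m < 2 or m != len(lr_at_pooled):
        raise ValueError("need two equally long lists with m >= 2")
    w_bar = mean(lr_own)
    w_tilde = mean(lr_at_pooled)
    r_l = (m + 1) / (k * (m - 1)) * (w_bar - w_tilde)
    if r_l <= 0:
        raise ValueError("r_l must be positive to compute df2")
    stat = w_tilde / (k * (1 + r_l))
    t = k * (m - 1)
    if t > 4:
        df2 = 4 + (t - 4) * (1 + (1 - 2 / t) / r_l) ** 2
    else:
        df2 = t * (1 + 1 / k) * (1 + 1 / r_l) ** 2 / 2
    return stat, k, df2, r_l


# Hypothetical statistics from m = 5 imputed datasets, testing k = 3 parameters
stat, df1, df2, r_l = pool_likelihood_ratio(
    [12.0, 14.0, 13.0, 15.0, 11.0], [11.0, 12.0, 12.0, 13.0, 12.0], k=3
)
print(f"F = {stat:.4f} on ({df1}, {df2:.2f}) df, r_L = {r_l:.2f}")
```

Re-evaluating the likelihood at the averaged estimates is the step that requires a fully specified likelihood.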

Guidelines for combining estimates of interest in prognostic studies

The procedures for combining multiply imputed estimates that are of particular interest in prognostic modelling are discussed in the following subsections. It is assumed that the full prognostic model is fitted and its performance evaluated within each imputed dataset and the required estimates (as given in Table 1) are obtained.

Combining estimates using Rubin's rules

The sample mean of a covariate, standard deviation, regression coefficients, individual prognostic index and the prognostic separation estimates can all be combined using Rubin's rules for single estimates. It is important to emphasise that the variance associated with a sample mean of a covariate is the sample variance divided by the number of observations, and hence not just the sample variance.

The likelihood ratio statistic for testing the hypothesis of no difference between two nested prognostic models from each imputed dataset can be combined using the inferences for likelihood ratio statistics (Table 2(D)) if a parametric model is fitted, or the rules for combining $\chi^2$ statistics (Table 2(C)) if a Cox model is fitted.

Combining estimates using Rubin's rules after suitable transformation

The correlation coefficient, hazard ratios, predicted survival probabilities and percentiles of the survival distribution can all be combined using Rubin's rules after suitable transformations to improve normality. The obtained combined estimates should be back transformed onto their original scale prior to analysis.

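Fisher's z transformation is appropriate for the correlation coefficient, and the logarithmic transformation for the hazard ratio.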

The complementary log-log transformation for the predicted survival probability at particular time-points gives a possible range of (-∞, +∞) instead of the survivorship estimate being bounded by zero and one, and is often used to determine reasonable confidence intervals. For the $p$th percentile of a survival distribution, a suitable transformation is the logarithmic transformation, as this gives a possible range of (-∞, +∞) instead of being bounded by zero and infinity, and is generally used to obtain a confidence interval.
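To illustrate combining after transformation, the sketch below pools predicted survival probabilities on the complementary log-log scale and back-transforms the result. Two choices here are assumptions rather than part of the guidelines: the standard errors are moved to the transformed scale with a first-order delta method, and the interval uses a normal quantile (1.96) rather than a $t$ reference distribution. The function name `pool_survival` and the data are hypothetical.

```python
from math import exp, log, sqrt
from statistics import mean, variance


def cloglog(s):
    """Complementary log-log transform of a survival probability in (0, 1)."""
    return log(-log(s))


def inv_cloglog(x):
    """Back-transform from the complementary log-log scale."""
    return exp(-exp(x))


def pool_survival(probs, ses, z=1.96):
    """Pool survival probabilities by Rubin's rules on the cloglog scale.

    Returns (estimate, lower, upper) on the probability scale.
    """
    m = len(probs)
    g = [cloglog(s) for s in probs]
    # delta method (assumption): se(g) is approximately se(S) / |S log S|
    u = [(se / (s * log(s))) ** 2 for s, se in zip(probs, ses)]
    g_bar = mean(g)
    t = mean(u) + (1 + 1 / m) * variance(g)
    half = z * sqrt(t)
    # cloglog is decreasing in S, so the interval ends swap on back-transformation
    return inv_cloglog(g_bar), inv_cloglog(g_bar + half), inv_cloglog(g_bar - half)


# Hypothetical predicted survival probabilities at one time-point, m = 5
surv = [0.70, 0.72, 0.68, 0.71, 0.69]
surv_se = [0.05, 0.05, 0.05, 0.05, 0.05]

est, lower, upper = pool_survival(surv, surv_se)
print(f"pooled survival = {est:.3f} ({lower:.3f} to {upper:.3f})")
```

The same pattern applies to hazard ratios and percentiles with `log` and `exp`, and to correlations with `atanh` and `tanh` from the `math` module.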

Combining model performance measures where the normality assumption is uncertain and variance estimates are generally unavailable

When considering model performance measures, the imputation model should be more general than the prognostic models being investigated, because the performance measures are more sensitive to the choice of imputation model and may therefore be more biased than the regression parameter estimates from the prognostic model. If one is willing to accept the large sample approximation to normality for the proportion of variance explained measures, e.g. Nagelkerke's $R^2$ statistic $R^2$ statistics and by Clark and Altman
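Table 1 recommends robust methods for performance measures such as the c-index. The sketch below takes one possible reading of that recommendation, namely summarising the $m$ values by their median and interquartile range. That reading, the function name `robust_summary` and the c-index values are all assumptions made for illustration.

```python
from statistics import median, quantiles


def robust_summary(values):
    """Summarise a performance measure across imputed datasets.

    Returns (median, lower quartile, upper quartile).
    """
    if len(values) < 2:
        raise ValueError("need at least two imputed datasets")
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical c-index values from m = 5 imputed datasets
c_index = [0.70, 0.72, 0.68, 0.71, 0.69]

med, q1, q3 = robust_summary(c_index)
print(f"c-index: median {med:.2f} (IQR {q1:.2f} to {q3:.2f})")
```

This summary needs no variance estimate for the measure within each imputed dataset and makes no normality assumption.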

Methods for literature review

A literature search was performed within the PubMed (National Library of Medicine) and Web of Science® bibliographic databases for all articles published before June 2008 that used multiple imputation techniques and a survival analysis to obtain a prognostic model. Methodological papers were excluded. The aim of the review was to identify how estimates of the parameters of interest in prognostic modelling studies have been combined after performing MI in the published literature.

Results

Sixteen non-methodological articles were identified. The MI techniques reported were varied with no overall consensus on technique or statistical software. The number of imputations ranged from five to 10000, with the majority of studies using five or ten imputations. The amount of missingness reported also varied from studies with relatively little missing data

Seven articles made no mention of how the estimates of interest were combined after MI. Clark et al.

Discussion

With the advances in computer technologies and software, MI is becoming more accessible. MI has been performed prior to the analysis of several prognostic modelling studies, e.g.

This paper has suggested guidelines for combining multiply imputed estimates that are of interest when a survival model is fitted to a dataset and suitable performance measures and predicted survival probabilities are required for summarising the model (Table 1). The proportion of variance explained, $R^2$, from a linear regression model fitted to normally distributed data can be considered as a squared correlation coefficient and can be transformed by taking the square root and then applying Fisher's Z transformation, as for the correlation coefficient. However, whether this approach would apply to $R^2$ measures from a survival regression model, which may be affected by censored observations, is debatable, and therefore robust methods are recommended here.

In this paper, model performance measures were calculated within each imputed dataset using the constructed prognostic model for that dataset and then combined to give an overall multiply imputed measure. The performance of a prognostic model derived using a development sample will also need to be externally validated using an independent dataset.

Conclusion

The review of current practice highlighted deficiencies in the reporting of how the multiply imputed estimates given in the published articles were obtained. Thus, it is recommended that future studies include a more thorough description of the methods used to combine all estimates after MI.

The ability to use MI methods that are readily available in standard statistical software and to apply simple rules to combine the estimates of interest, rather than requiring problem-specific programmes, makes MI more accessible to practising statisticians. We hope that this may lead to a more widespread and appropriate use of MI in future prognostic modelling studies and improved comparability of the obtained estimates between studies.

Abbreviations

MI: multiple imputation; m: number of imputations; SE: standard error.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

All authors made substantial contributions to the ideas presented in this manuscript. AM participated in the conception of this research, the methodological content, the design, coordination and analysis of the literature review and drafted the manuscript. DGA was involved in the conception and design of the study and helped in the writing of the manuscript. RLH participated in the design and methodological content of this study and in the revision of the manuscript. PR contributed to the methodological content of this research and the revision of the manuscript. All authors have read and approved the final manuscript.

Acknowledgements

Andrea Marshall (née Burton) was supported by a Cancer Research UK project grant. DGA is supported by Cancer Research UK.
