Reporting performance of prognostic models in cancer: a review
1 Centre for Statistics in Medicine, Wolfson College Annexe, University of Oxford, Linton Road, Oxford, OX2 6UD, UK
2 MRC Clinical Trials Unit, 222 Euston Road, London NW1 2DA, UK
BMC Medicine 2010, 8:21 doi:10.1186/1741-7015-8-21
Published: 30 March 2010
Appropriate choice and use of prognostic models in clinical practice require good methods both for developing the models and for deriving prognostic indices and risk groups from them. To assess reliability and generalizability, models need to have been validated and measures of model performance reported. We reviewed published articles to assess the methods and reporting used to develop prognostic indices and risk groups from prognostic models and to evaluate their performance.
We developed a systematic search string and identified articles from PubMed. Forty-seven articles satisfied the following inclusion criteria: published in 2005; aiming to predict patient outcome; presenting new prognostic models in cancer with a time-to-event outcome and a combination of at least two separate variables; and analysing data using multivariable analysis suitable for time-to-event data.
Among the 47 studies, Cox models were used in 94% (44), but the coefficients or hazard ratios for the variables in the final model were reported in only 72% (34). The reproducibility of the derived model was assessed in only 11% (5) of the articles. A prognostic index was developed from the model in 81% (38) of the articles, but researchers derived the prognostic index from the final prognostic model in only 34% (13) of these studies; different coefficients or variables from those in the final model were used in 50% (19), and the methods used were unclear in 16% (6). Methods used to derive prognostic groups were also poor: researchers did not report the methods used in 39% (14 of 36) of the studies, and data-derived methods likely to bias estimates of differences between risk groups were used in 28% (10). Validation of the models was reported in only 34% (16) of the studies. Validation used data from the same population in 15 studies and from a different population in five. When reports of validation with external data from publications up to four years following model development were included, external validation was attempted for only 21% (10) of models. Insufficient information was provided on the performance of models in terms of discrimination and calibration.
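To make concrete what deriving a prognostic index from the final model involves, the following is a minimal sketch, not a method taken from any reviewed study. It assumes a fitted Cox model whose coefficients (log hazard ratios) are known, computes the index as the linear predictor, and assigns risk groups using cutpoints fixed in advance rather than chosen from the outcome data. The variable names, coefficient values and cutpoints are all hypothetical.

```python
import math

# Hypothetical Cox coefficients (log hazard ratios), for illustration only.
COEFFICIENTS = {"age_over_60": 0.50, "stage_iii": 0.90, "high_grade": 0.40}

# Hypothetical cutpoints, assumed to be prespecified rather than data-derived.
CUTPOINTS = (0.45, 1.15)


def hazard_ratios(coefficients=COEFFICIENTS):
    """Convert each coefficient to a hazard ratio, exp(beta)."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


def prognostic_index(patient, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient times covariate value,
    using every variable in the final model with its own coefficient."""
    return sum(beta * patient[name] for name, beta in coefficients.items())


def risk_group(index, cutpoints=CUTPOINTS):
    """Assign a risk group from the prognostic index and fixed cutpoints."""
    low, high = cutpoints
    if index < low:
        return "low"
    if index < high:
        return "intermediate"
    return "high"


# Usage: a hypothetical patient with two of the three risk factors.
patient = {"age_over_60": 1, "stage_iii": 1, "high_grade": 0}
index = prognostic_index(patient)
print(round(index, 2), risk_group(index))
```

Using the model's own coefficients keeps the index consistent with the reported model; rounding or replacing them, or choosing cutpoints by inspecting the outcome data, are the kinds of departures the review counts above.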
Many published prognostic models have been developed using poor methods, and many are poorly reported. Both problems compromise the reliability and clinical relevance of the models and of the prognostic indices and risk groups derived from them.