An assessment of the relationship between clinical utility and predictive ability measures and the impact of mean risk in the population
1 Sydney School of Public Health, The University of Sydney, Sydney, Australia
2 The Screening and Test Evaluation Program, The University of Sydney, Sydney, Australia
3 Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre (AMC), University of Amsterdam, Amsterdam, The Netherlands
BMC Medical Research Methodology 2014, 14:86 doi:10.1186/1471-2288-14-86Published: 3 July 2014
Measures of clinical utility (net benefit and event free life years) have been recommended in the assessment of a new predictor in a risk prediction model. However, it is not clear how they relate to the measures of predictive ability and reclassification, such as the c-statistic and Net Reclassification Improvement (NRI), or how these measures are affected by differences in mean risk between populations when a fixed cutpoint to define high risk is assumed.
We examined the relationship between measures of clinical utility (net benefit, event free life years) and predictive ability (c-statistic, binary c-statistic, continuous NRI(0), NRI with two cutpoints, binary NRI) using simulated data and the Framingham dataset.
In the analysis of simulated data, the addition of a new predictor tended to result in more people being treated when the mean risk was less than the cutpoint, and fewer people being treated for mean risks beyond the cutpoint. The reclassification and clinical utility measures showed similar relationships with mean risk when the mean risk was less than the cutpoint and the baseline model was not strong. However, when the mean risk was greater than the cutpoint, or the baseline model was strong, the reclassification and clinical utility measures diverged in their relationship with mean risk.
Although the risk of CVD was lower for women compared to men in the Framingham dataset, the measures of predictive ability, reclassification and clinical utility were both larger for women. The difference in these results was, in part, due to the larger hazard ratio associated with the additional risk predictor (systolic blood pressure) for women.
Measures such as the c-statistic and the measures of reclassification do not capture the consequences of implementing different prediction models. We do not recommend their use in evaluating which new predictors may be clinically useful in a particular population. We recommend that a measure such as net benefit or EFLY is calculated and, where appropriate, the measure is weighted to account for differences in the distribution of risks between the study population and the population in which the new predictors will be implemented.