Comparison of approaches to estimate confidence intervals of post-test probabilities of diagnostic test results in a nested case-control study
1 Julius Center for Health Sciences and Primary Care and Division of Anesthesiology, Intensive Care and Emergency Medicine, University Medical Center Utrecht, PO Box 85500, 3508 GA, Utrecht, The Netherlands
2 Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, PO box 85500, 3508 GA, Utrecht, The Netherlands
3 Department of Epidemiology, Biostatistics and Health Technology Assessment, Radboud University Nijmegen Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
BMC Medical Research Methodology 2012, 12:166 doi:10.1186/1471-2288-12-166. Published: 31 October 2012
Nested case–control studies are becoming increasingly popular because they can be very efficient for quantifying the diagnostic accuracy of costly or invasive tests or (bio)markers. However, they do not allow direct estimation of the test's predictive values or post-test probabilities, let alone their confidence intervals (CIs). Correct estimates of the predictive values themselves can easily be obtained with a simple correction by the (inverse) sampling fractions of the cases and controls. But using this correction to estimate the corresponding standard error (SE) falsely inflates the number of patients actually studied, yielding CIs that are too narrow. We compared different approaches for estimating the SE, and thus the CI, of predictive values or post-test probabilities of diagnostic test results in a nested case–control study.
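The inverse sampling fraction correction described above can be illustrated with a minimal Python sketch. The function name, the 2x2 counts, and the sampling fractions (all cases sampled, one in four controls sampled) are hypothetical and chosen only for illustration. The last lines show why plugging the reweighted count into the standard formula for the SE of a proportion gives a smaller SE than the number of patients actually studied would support; neither SE is presented here as the correct one.

```python
import math

def corrected_predictive_values(tp, fn, fp, tn, frac_cases, frac_controls):
    """Predictive values from nested case-control counts.

    Each count is weighted by the inverse of the sampling fraction of the
    group (cases or controls) it was drawn from.
    """
    w_case = 1.0 / frac_cases
    w_ctrl = 1.0 / frac_controls
    ppv = tp * w_case / (tp * w_case + fp * w_ctrl)
    npv = tn * w_ctrl / (tn * w_ctrl + fn * w_case)
    return ppv, npv

# Hypothetical sampled counts (assumption, not data from the study)
tp, fn, fp, tn = 80, 20, 50, 150
frac_cases, frac_controls = 1.0, 0.25

ppv, npv = corrected_predictive_values(tp, fn, fp, tn, frac_cases, frac_controls)

# Test-positive patients actually studied vs. the reweighted count
observed_positives = tp + fp
weighted_positives = tp / frac_cases + fp / frac_controls

# Standard SE of a proportion, evaluated with each denominator
se_weighted_n = math.sqrt(ppv * (1 - ppv) / weighted_positives)
se_observed_n = math.sqrt(ppv * (1 - ppv) / observed_positives)

print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
print(f"SE with reweighted n = {se_weighted_n:.4f}")
print(f"SE with observed n   = {se_observed_n:.4f}")
```

In this made-up example the reweighted denominator is more than twice the number of test-positive patients actually observed, which is the mechanism behind the overly narrow CIs mentioned above.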
We created datasets based on a large, previously published diagnostic study of two different tests (the D-dimer test and the calf difference test) with a nested case–control design. We compared six approaches: (1) the standard formula for the SE of a proportion, (2) an adaptation of the standard formula using the sampling fraction, (3) a bootstrap procedure, (4) an approach that uses the sensitivity, the specificity and the prevalence, (5) weighted logistic regression, and (6) approach 4 on the log odds scale. The approaches were compared with respect to CI coverage and CI width.
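As a rough illustration of what a bootstrap procedure (approach 3) might look like, the Python sketch below resamples cases and controls separately, recomputes the reweighted positive predictive value in each replicate, and takes a percentile interval. The abstract does not specify the resampling scheme or the interval type, so the stratified resampling, the percentile method, the function names, and the data are all assumptions made for this sketch rather than the authors' procedure.

```python
import random

def weighted_ppv(case_results, control_results, frac_cases, frac_controls):
    """PPV with test-positive counts weighted by inverse sampling fractions.

    case_results / control_results are lists of 0/1 test results.
    """
    tp = sum(case_results) / frac_cases
    fp = sum(control_results) / frac_controls
    total = tp + fp
    return tp / total if total > 0 else float("nan")

def bootstrap_ppv_ci(case_results, control_results, frac_cases, frac_controls,
                     n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling cases and controls separately
    (an assumed scheme; other resampling designs are possible)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        cases = rng.choices(case_results, k=len(case_results))
        controls = rng.choices(control_results, k=len(control_results))
        est = weighted_ppv(cases, controls, frac_cases, frac_controls)
        if est == est:  # drop replicates with no positive results (NaN)
            estimates.append(est)
    estimates.sort()
    n = len(estimates)
    lower = estimates[max(0, int((alpha / 2) * n))]
    upper = estimates[min(n - 1, int((1 - alpha / 2) * n))]
    return lower, upper

# Hypothetical data: 100 sampled cases (80 positive), 200 sampled controls (50 positive)
case_results = [1] * 80 + [0] * 20
control_results = [1] * 50 + [0] * 150

point = weighted_ppv(case_results, control_results, 1.0, 0.25)
lower, upper = bootstrap_ppv_ci(case_results, control_results, 1.0, 0.25)
print(f"PPV = {point:.3f}, 95% bootstrap CI = ({lower:.3f}, {upper:.3f})")
```

Coverage, as used in the comparison, would then be the proportion of simulated nested case-control datasets whose interval contains the predictive value of the full cohort, and CI width is simply `upper - lower`.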
The bootstrap procedure (approach 3) showed good coverage and relatively small CI widths. Approaches 4 and 6 showed some undercoverage, particularly for the D-dimer test with frequent positive results (positive results around 70%). Approaches 1, 2 and 5 showed clear overcoverage at low prevalences of 0.05 and 0.1 in the cohorts for all case–control ratios.
The results of our study suggest that a bootstrap procedure is necessary to assess the confidence interval for the predictive values or post-test probabilities of diagnostic test results in studies using a nested case–control design.