Table 3

Recommendations for improved research when developing and validating risk prediction models from multiple studies
Area for improvement / Recommendations

Rationale and initiation
• Produce a protocol for the project detailing its rationale, conduct and statistical analysis, and reference the protocol

Obtaining IPD
• Report how the primary study authors were approached for their IPD
• Report the strategy used to identify relevant studies, e.g. literature review/collaborative group
• If a literature review was performed, report the search strategy, including keywords and databases used
• Provide a flowchart showing the search strategy, classification of identified articles, and retrieval of IPD from relevant studies
• Report any prior sample size considerations used, such as the number of IPD studies deemed necessary and the number of patients and events required. If no sample size requirements were considered, report this also

Details of IPD
• Report the number of patients and events for each study used in model development and/or validation
• Report the missing data for each study (e.g. whether predictors were missing entirely, or how many patients had predictor values missing), and whether some patients or studies were entirely excluded for this reason
• Detail the reasons why IPD was unavailable for some desired studies (if applicable), and report the number of patients and events from these studies
• If any studies were excluded after IPD was obtained, provide the number of studies excluded and explain why they were removed (e.g. missing predictors, different outcome definition, different methods of measurement)
• Compare and report the quality of studies for which IPD was obtained

Statistical methods for model development
• Account for clustering of patients within studies, for example by allowing for a separate intercept per study
• Report the selection criteria and procedure used to decide which predictors are included in the final model
• Assess and report any between-study heterogeneity in the effects of included predictors
• If large heterogeneity does exist in the effects of particular predictors, try to reduce it by including more predictors, or simply focus on including homogeneous or weakly heterogeneous factors
• Where possible, model continuous predictors on their continuous scale, unless there is good clinical or statistical reason to categorise them
• Report the final developed model in its original format with alpha (baseline risk) and beta estimates, so that others can ascertain how to apply the model in practice
• Detail how missing patient-level data and missing study-level factors were dealt with in the analysis

Model validation and implementation
• Validate the model that has been developed using internal-external cross-validation, although we tentatively suggest that at least 4 studies are required for this approach
• Explain the choice of intercept (baseline hazard) to be used when implementing the model in the excluded study
• Report validation statistics for each study excluded in the internal-external cross-validation method
• Report clearly whether there is evidence the model performs consistently well during the internal-external validation
• If it performs consistently well, clearly report the final overall prediction model to be used in practice, and emphasise again how the intercept should be chosen upon application
• If it does not perform consistently well, clearly flag those populations for which the model cannot be applied and draw attention to the model’s lack of generalisability

Impact of missing IPD studies
• If possible, compare the populations of those studies not providing IPD to those studies providing IPD, to understand whether the developed model may need further generalisation in such populations in the future
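
The internal-external cross-validation recommended in the table can be illustrated with a minimal sketch. It is not the authors' implementation: the data are simulated, the model is a logistic regression with a single predictor fitted by plain gradient descent, and all function and study names (`fit_logistic`, `internal_external_cv`, `study_A`, and so on) are hypothetical. Each study is left out in turn, the model is developed on the remaining studies with a separate intercept per study, and performance is reported for the excluded study. Using the average of the development-study intercepts in the excluded study is only one possible choice, made here for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, y, lr=0.5, n_iter=300):
    """Logistic regression by gradient descent; no implicit intercept is added."""
    n, p = len(rows), len(rows[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for r, yi in zip(rows, y):
            err = sigmoid(sum(c * v for c, v in zip(coef, r))) - yi
            for j, v in enumerate(r):
                grad[j] += err * v
        coef = [c - lr * g / n for c, g in zip(coef, grad)]
    return coef


def c_statistic(y, risk):
    """Share of event/non-event pairs in which the event has the higher risk."""
    events = [r for yi, r in zip(y, risk) if yi == 1]
    nonevents = [r for yi, r in zip(y, risk) if yi == 0]
    concordant = sum((e > ne) + 0.5 * (e == ne) for e in events for ne in nonevents)
    return concordant / (len(events) * len(nonevents))


def simulate_study(rng, n, alpha, beta=1.0):
    """Simulated IPD for one study: one standard-normal predictor, binary outcome."""
    X = [[rng.gauss(0.0, 1.0)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(alpha + beta * x[0]) else 0 for x in X]
    return X, y


def internal_external_cv(studies):
    """Leave each study out in turn; develop on the rest, validate on the one left out."""
    results = {}
    for held_out in studies:
        train = [s for s in studies if s != held_out]
        rows, y = [], []
        for k, s in enumerate(train):
            X, ys = studies[s]
            for x, yi in zip(X, ys):
                # One indicator column per development study = separate intercepts.
                rows.append([1.0 if j == k else 0.0 for j in range(len(train))] + x)
                y.append(yi)
        coef = fit_logistic(rows, y)
        alphas, betas = coef[:len(train)], coef[len(train):]
        # Assumption: apply the mean development-study intercept to the excluded study.
        alpha = sum(alphas) / len(alphas)
        X_test, y_test = studies[held_out]
        risk = [sigmoid(alpha + sum(b * v for b, v in zip(betas, x))) for x in X_test]
        results[held_out] = {
            "c_statistic": c_statistic(y_test, risk),
            "obs_exp_ratio": sum(y_test) / sum(risk),  # calibration-in-the-large
        }
    return results


rng = random.Random(0)
true_alphas = {"study_A": -1.0, "study_B": -0.5, "study_C": -1.5, "study_D": -0.8}
studies = {name: simulate_study(rng, 120, a) for name, a in true_alphas.items()}
results = internal_external_cv(studies)
for name, stats in results.items():
    print(name, round(stats["c_statistic"], 2), round(stats["obs_exp_ratio"], 2))
```

In this sketch, a similar C-statistic across excluded studies would suggest consistent discrimination, while an observed/expected ratio far from 1 in a particular study would suggest that the chosen intercept does not transfer well to that population.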

Ahmed et al. BMC Medical Research Methodology 2014 14:3   doi:10.1186/1471-2288-14-3
