Abstract
It is widely recommended that any developed diagnostic or prognostic prediction model is externally validated in terms of its predictive performance, as measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings, and reveals under which circumstances the model performs suboptimally (or more poorly) and may need adjustment. We discuss how to undertake meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome. We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically based prior distributions to improve estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed R package "metamisc".
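As a minimal sketch of the kind of analysis the abstract describes, the snippet below pools a c-statistic and an observed:expected (O:E) ratio across a set of hypothetical validation studies with metamisc's `valmeta()` function. The data frame, its column names and values are illustrative assumptions (not the EuroSCORE II or Framingham data analysed in the article), and the exact argument names may differ between metamisc versions.

```r
# Sketch: random-effects meta-analysis of external-validation results
# using the metamisc package. All data below are made up for illustration.
library(metamisc)

validations <- data.frame(
  study    = paste("Study", 1:5),
  cstat    = c(0.79, 0.74, 0.81, 0.76, 0.72),  # reported c-statistics
  cstat.se = c(0.02, 0.03, 0.02, 0.04, 0.03),  # their standard errors
  O        = c(120, 45, 210, 60, 98),          # observed events
  E        = c(100, 50, 230, 55, 110),         # events expected by the model
  N        = c(1500, 600, 2800, 700, 1200)     # validation sample sizes
)

# Summary estimate of discrimination (pooled c-statistic)
fit_cstat <- valmeta(measure = "cstat",
                     cstat = cstat, cstat.se = cstat.se,
                     N = N, O = O, data = validations)
fit_cstat

# Summary estimate of calibration-in-the-large (pooled O:E ratio)
fit_oe <- valmeta(measure = "OE",
                  O = O, E = E, N = N, data = validations)
fit_oe
```

The returned objects report the pooled performance estimate together with an interval reflecting between-study heterogeneity, which is the quantity of interest when judging whether a model transports well across settings.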
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 2768-2786 |
| Number of pages | 19 |
| Journal | Statistical Methods in Medical Research |
| Volume | 28 |
| Issue number | 9 |
| Early online date | 1 Jan 2018 |
| DOIs | |
| Publication status | Published - Sept 2019 |
Keywords
- prediction
- discrimination
- evidence synthesis
- systematic review
- calibration
- prognosis
- validation
- meta-analysis
- aggregate data