Practical guidance for validating the predictive performance in the presence of missing data: a guide for the clinical researcher

Research output: Contribution to journal; Article; Academic; peer-review

Abstract

Background and Objectives: Prediction models are widely used across all fields of medicine as tools to support patient counseling and guide treatment decisions. A key step before any prediction model can be implemented in clinical practice is internal validation, for which principles are well described in the literature. However, the application of these principles is challenging when complex models are used or when missing values are present in the predictor variables. Approaches for internal validation and handling of missing data often result in a multitude of datasets, such as multiple bootstrapped samples across multiple imputations. Analyzing such cross-multiplied datasets in a streamlined manner is not straightforward.

Methods, Results, Conclusion: This paper provides practical guidance and a structured R workflow to support clinical researchers in combining internal validation and imputation methods when building reliable prediction models.

Original language: English
Article number: 112159
Journal: Journal of Clinical Epidemiology
Volume: 192
Early online date: 21 Jan 2026
DOIs
Publication status: E-pub ahead of print - 21 Jan 2026

Keywords

  • Imputation
  • Internal validation
  • Missing data
  • Model validation
  • Penalization
  • Prediction
  • Shrinkage
