Reliability of data-driven versus expert-driven composite indicators in between-hospital comparisons on quality of oesophagogastric cancer surgery: a population-based retrospective cohort study

Margrietha Van der Linde*, Frank Eijkenaar, Maurits R. Visser, Bas PL Wijnhoven, Hester F. Lingsma, Martijn AH Oude Voshaar

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Objective: To construct a data-driven composite from (a subset of) currently used quality indicators for oesophagogastric cancer surgery and to evaluate whether this approach enhances the reliability of between-hospital comparisons on outcome relative to the expert-driven composite indicator ‘textbook outcome (TO)’.

Design: In this retrospective cohort study, we applied Item Response Theory (IRT) to construct a data-driven continuous composite indicator reflecting a single latent variable (the quality of surgical care) and estimated latent variable scores for all individual patients. Reliability was compared between the expert-driven (TO) and data-driven (IRT) composite indicators.

Setting: All Dutch hospitals providing oesophagogastric cancer surgery.

Participants: All patients who underwent oesophagectomy (n=3588) or gastrectomy (n=1782) between 2018 and 2022 as registered in the Dutch Upper GI Cancer Audit (DUCA).

Primary and secondary outcome measures: We evaluated the reliability of between-hospital comparisons using ‘rankability’, which quantifies the proportion of observed variation in indicator scores between hospitals not attributable to chance.

Results: Seven out of 15 quality indicators were included in the IRT composite indicator. Most of the patients were assigned the artificial maximum of the continuous quality score (ie, ceiling effect), resulting in similar average hospital scores. Relative to TO, rankability increased when using the IRT composite for oesophagectomy (57% vs 41%) but declined for gastrectomy (38% vs 47%).

Conclusions: The selected seven quality indicators for oesophageal and gastric cancer surgery represent a single latent variable but are not yet optimal for differentiating surgical care quality due to ceiling effects. Despite using fewer indicators, the continuous IRT score showed a promising increase in rankability for oesophagectomy, suggesting that data-driven composite indicators may enhance hospital benchmarking reliability.
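The abstract defines rankability only in words. As an illustration, the sketch below uses one formulation that is common in the hospital-profiling literature: the between-hospital variance divided by the sum of that variance and the median squared standard error of the individual hospital estimates. Whether the study computed rankability in exactly this way is an assumption; the function name `rankability`, its parameters, and the numbers in the usage example are hypothetical and not taken from the study.

```python
from statistics import median


def rankability(between_variance, standard_errors):
    """Share of observed between-hospital variation not attributable to chance.

    Assumed formulation (not confirmed by the abstract):
        rho = tau^2 / (tau^2 + median(s_i^2))
    where tau^2 is the between-hospital variance (for example, from a
    random-effects model) and s_i is the standard error of hospital i's
    own estimate.
    """
    if between_variance < 0:
        raise ValueError("between_variance must be non-negative")
    squared_errors = [se ** 2 for se in standard_errors]
    if not squared_errors:
        raise ValueError("at least one standard error is required")
    chance_variance = median(squared_errors)
    total = between_variance + chance_variance
    if total == 0:
        raise ValueError("total variance is zero; rankability is undefined")
    return between_variance / total


# Hypothetical inputs, chosen only to show the arithmetic.
example = rankability(0.04, [0.1, 0.2, 0.3])
print(f"rankability = {example:.0%}")
```

Under this formulation, a larger between-hospital variance or smaller hospital-level standard errors (for instance, from higher case volumes or a more informative continuous score) both push rankability upward.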

Original language: English
Article number: e104832
Journal: BMJ Open
Volume: 15
Issue number: 11
Publication status: Published - 5 Nov 2025

Keywords

  • Clinical audit
  • Health policy
  • Quality Improvement
  • Quality in health care
