TY - JOUR
T1 - Reliability of data-driven versus expert-driven composite indicators in between-hospital comparisons on quality of oesophagogastric cancer surgery
T2 - a population-based retrospective cohort study
AU - Van der Linde, Margrietha
AU - Eijkenaar, Frank
AU - Visser, Maurits R.
AU - Wijnhoven, Bas PL
AU - Lingsma, Hester F.
AU - Oude Voshaar, Martijn AH
N1 - Publisher Copyright: © Author(s) (or their employer(s)) 2025. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ Group.
PY - 2025/11/5
Y1 - 2025/11/5
N2 - Objective To construct a data-driven composite from (a subset of) currently used quality indicators for oesophagogastric cancer surgery and to evaluate whether this approach enhances the reliability of between-hospital comparisons on outcome relative to the expert-driven composite indicator ‘textbook outcome (TO)’. Design In this retrospective cohort study, we applied Item Response Theory (IRT) to construct a data-driven continuous composite indicator reflecting a single latent variable (the quality of surgical care) and estimated latent variable scores for all individual patients. Reliability was compared between the expert-driven (TO) and data-driven (IRT) composite indicators. Setting All Dutch hospitals providing oesophagogastric cancer surgery. Participants All patients who underwent oesophagectomy (n=3588) or gastrectomy (n=1782) between 2018 and 2022 as registered in the Dutch Upper GI Cancer Audit (DUCA). Primary and secondary outcome measures We evaluated the reliability of between-hospital comparisons using ‘rankability’, which quantifies the proportion of observed variation in indicator scores between hospitals not attributable to chance. Results Seven out of 15 quality indicators were included in the IRT composite indicator. Most of the patients were assigned the artificial maximum of the continuous quality score (ie, ceiling effect), resulting in similar average hospital scores. Relative to TO, rankability increased when using the IRT composite for oesophagectomy (57% vs 41%) but declined for gastrectomy (38% vs 47%). Conclusions The selected seven quality indicators for oesophageal and gastric cancer surgery represent a single latent variable but are not yet optimal for differentiating surgical care quality due to ceiling effects. Despite using fewer indicators, the continuous IRT score showed a promising increase in rankability for oesophagectomy, suggesting that data-driven composite indicators may enhance hospital benchmarking reliability.
AB - Objective To construct a data-driven composite from (a subset of) currently used quality indicators for oesophagogastric cancer surgery and to evaluate whether this approach enhances the reliability of between-hospital comparisons on outcome relative to the expert-driven composite indicator ‘textbook outcome (TO)’. Design In this retrospective cohort study, we applied Item Response Theory (IRT) to construct a data-driven continuous composite indicator reflecting a single latent variable (the quality of surgical care) and estimated latent variable scores for all individual patients. Reliability was compared between the expert-driven (TO) and data-driven (IRT) composite indicators. Setting All Dutch hospitals providing oesophagogastric cancer surgery. Participants All patients who underwent oesophagectomy (n=3588) or gastrectomy (n=1782) between 2018 and 2022 as registered in the Dutch Upper GI Cancer Audit (DUCA). Primary and secondary outcome measures We evaluated the reliability of between-hospital comparisons using ‘rankability’, which quantifies the proportion of observed variation in indicator scores between hospitals not attributable to chance. Results Seven out of 15 quality indicators were included in the IRT composite indicator. Most of the patients were assigned the artificial maximum of the continuous quality score (ie, ceiling effect), resulting in similar average hospital scores. Relative to TO, rankability increased when using the IRT composite for oesophagectomy (57% vs 41%) but declined for gastrectomy (38% vs 47%). Conclusions The selected seven quality indicators for oesophageal and gastric cancer surgery represent a single latent variable but are not yet optimal for differentiating surgical care quality due to ceiling effects. Despite using fewer indicators, the continuous IRT score showed a promising increase in rankability for oesophagectomy, suggesting that data-driven composite indicators may enhance hospital benchmarking reliability.
KW - Clinical audit
KW - Health policy
KW - Quality Improvement
KW - Quality in health care
UR - https://www.scopus.com/pages/publications/105021069590
U2 - 10.1136/bmjopen-2025-104832
DO - 10.1136/bmjopen-2025-104832
M3 - Article
C2 - 41198208
AN - SCOPUS:105021069590
SN - 2044-6055
VL - 15
JO - BMJ Open
JF - BMJ Open
IS - 11
M1 - e104832
ER -