TY - JOUR
T1 - Measuring the performance of prediction models to personalize treatment choice
AU - Efthimiou, Orestis
AU - Hoogland, Jeroen
AU - Debray, Thomas P.A.
AU - Seo, Michael
AU - Furukawa, Toshiaki A.
AU - Egger, Matthias
AU - White, Ian R.
N1 - Publisher Copyright:
© 2023 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
PY - 2023/4/15
Y1 - 2023/4/15
N2 - When data are available from individual patients receiving either a treatment or a control intervention in a randomized trial, various statistical and machine learning methods can be used to develop models for predicting future outcomes under the two conditions, and thus to predict treatment effect at the patient level. These predictions can subsequently guide personalized treatment choices. Although several methods for validating prediction models are available, little attention has been given to measuring the performance of predictions of personalized treatment effect. In this article, we propose a range of measures that can be used to this end. We start by defining two dimensions of model accuracy for treatment effects, for a single outcome: discrimination for benefit and calibration for benefit. We then amalgamate these two dimensions into an additional concept, decision accuracy, which quantifies the model's ability to identify patients for whom the benefit from treatment exceeds a given threshold. Subsequently, we propose a series of performance measures related to these dimensions and discuss estimation procedures, focusing on randomized data. Our methods are applicable for continuous or binary outcomes, for any type of prediction model, as long as it uses baseline covariates to predict outcomes under treatment and control. We illustrate all methods using two simulated datasets and a real dataset from a trial in depression. We implement all methods in the R package predieval. Results suggest that the proposed measures can be useful in evaluating and comparing the performance of competing models in predicting individualized treatment effect.
KW - heterogeneous treatment effects
KW - personalized medicine
KW - prediction modelling
UR - http://www.scopus.com/inward/record.url?scp=85147289187&partnerID=8YFLogxK
DO - 10.1002/sim.9665
M3 - Article
C2 - 36700492
AN - SCOPUS:85147289187
SN - 0277-6715
VL - 42
SP - 1188
EP - 1206
JO - Statistics in Medicine
JF - Statistics in Medicine
IS - 8
ER -