TY - JOUR
T1 - Rainforest
T2 - A random forest approach to predict treatment benefit in data from (failed) clinical drug trials
AU - Ubels, Joske
AU - Schaefers, Tilman
AU - Punt, Cornelis
AU - Guchelaar, Henk Jan
AU - de Ridder, Jeroen
N1 - Publisher Copyright:
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected]
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Motivation: When phase III clinical drug trials fail their endpoint, enormous resources are wasted. Moreover, even if a clinical trial demonstrates a significant benefit, the observed effects are often small and may not outweigh the side effects of the drug. Therefore, there is a great clinical need for methods to identify genetic markers that can identify subgroups of patients which are likely to benefit from treatment as this may (i) rescue failed clinical trials and/or (ii) identify subgroups of patients which benefit more than the population as a whole. When single genetic biomarkers cannot be found, machine learning approaches that find multivariate signatures are required. For single nucleotide polymorphism (SNP) profiles, this is extremely challenging owing to the high dimensionality of the data. Here, we introduce RAINFOREST (tReAtment benefIt prediction using raNdom FOREST), which can predict treatment benefit from patient SNP profiles obtained in a clinical trial setting. Results: We demonstrate the performance of RAINFOREST on the CAIRO2 dataset, a phase III clinical trial which tested the addition of cetuximab treatment for metastatic colorectal cancer and concluded there was no benefit. However, we find that RAINFOREST is able to identify a subgroup comprising 27.7% of the patients that do benefit, with a hazard ratio of 0.69 (P ¼ 0.04) in favor of cetuximab. The method is not specific to colorectal cancer and could aid in reanalysis of clinical trial data and provide a more personalized approach to cancer treatment, also when there is no clear link between a single variant and treatment benefit.
AB - Motivation: When phase III clinical drug trials fail their endpoint, enormous resources are wasted. Moreover, even if a clinical trial demonstrates a significant benefit, the observed effects are often small and may not outweigh the side effects of the drug. Therefore, there is a great clinical need for methods to identify genetic markers that can identify subgroups of patients which are likely to benefit from treatment as this may (i) rescue failed clinical trials and/or (ii) identify subgroups of patients which benefit more than the population as a whole. When single genetic biomarkers cannot be found, machine learning approaches that find multivariate signatures are required. For single nucleotide polymorphism (SNP) profiles, this is extremely challenging owing to the high dimensionality of the data. Here, we introduce RAINFOREST (tReAtment benefIt prediction using raNdom FOREST), which can predict treatment benefit from patient SNP profiles obtained in a clinical trial setting. Results: We demonstrate the performance of RAINFOREST on the CAIRO2 dataset, a phase III clinical trial which tested the addition of cetuximab treatment for metastatic colorectal cancer and concluded there was no benefit. However, we find that RAINFOREST is able to identify a subgroup comprising 27.7% of the patients that do benefit, with a hazard ratio of 0.69 (P ¼ 0.04) in favor of cetuximab. The method is not specific to colorectal cancer and could aid in reanalysis of clinical trial data and provide a more personalized approach to cancer treatment, also when there is no clear link between a single variant and treatment benefit.
UR - http://www.scopus.com/inward/record.url?scp=85099187564&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btaa799
DO - 10.1093/bioinformatics/btaa799
M3 - Article
C2 - 33381829
AN - SCOPUS:85099187564
SN - 1367-4803
VL - 36
SP - I601-I609
JO - Bioinformatics
JF - Bioinformatics
IS - S2
ER -