TY - JOUR
T1 - plasmaCHORD
T2 - A Machine Learning Approach to Distinguish Clonal Hematopoiesis-Derived Variants in Liquid Biopsies from Patients with Solid Tumors
AU - Canzoniero, Jenna V.
AU - Rabizadeh, Daniel
AU - Ziakas, Ilias
AU - Wehr, Jaime
AU - Balan, Archana
AU - Jamali, Amna
AU - Landon, Blair V.
AU - Sivapalan, Lavanya
AU - Scott, Susan
AU - Pereira, Gavin
AU - Lam, Vincent K.
AU - Hann, Christine L.
AU - Lovly, Christine M.
AU - Tao, Jessica
AU - Forde, Patrick M.
AU - Murray, Joseph C.
AU - Sausen, Mark
AU - Meijer, Gerrit A.
AU - Vink, Geraldine R.
AU - Fijneman, Remond J.A.
AU - Velculescu, Victor E.
AU - Phallen, Jillian
AU - Scharpf, Robert B.
AU - Anagnostou, Valsamo
N1 - Publisher Copyright: ©2026 The Authors; Published by the American Association for Cancer Research.
PY - 2026/5/1
Y1 - 2026/5/1
AB - PURPOSE: Targeted next-generation sequencing (NGS) of cell-free DNA (cfDNA) enables comprehensive molecular profiling and can guide the selection of genotype-targeted therapies. However, the detection of variants derived from clonal hematopoiesis (CH) is a significant confounder in liquid biopsies. EXPERIMENTAL DESIGN: Using a training cohort of 426 variants identified in cfDNA NGS from 225 patients with stage I to IV solid tumors, we developed plasma Clonal Hematopoiesis ORigin Detection (plasmaCHORD), a machine learning model that includes fragment-, variant-, and patient-level features to distinguish between tumor and CH origin for each variant detected by liquid biopsies. Model performance was assessed by comparison with the reference origin for each plasma variant determined from matched white blood cell and tumor NGS. Following the locking of the model parameters, we applied plasmaCHORD to an independent validation cohort of 1,418 plasma variants detected in 114 patients with metastatic cancers, as well as to cfDNA NGS from patients enrolled in a prospective clinical trial (NCT05585684). RESULTS: plasmaCHORD predicted tumor origin versus CH origin in the training set with high accuracy (AUC = 0.94). In the independent validation cohort, the locked model maintained similar overall accuracy (AUC = 0.9) and demonstrated significant improvement in accuracy for clinically significant genes. When applied to clinically challenging cases in the context of a precision oncology clinical trial, plasmaCHORD precisely determined variant origin, preventing mismatches with genotype-targeted therapies. CONCLUSIONS: plasmaCHORD, a multifeature machine learning model, can significantly enhance the ability to identify bona fide tumor variants in routine plasma-only NGS, addressing a critical need for implementing liquid biopsy-guided therapy by minimizing misinterpretation caused by CH.
UR - https://www.scopus.com/pages/publications/105037886678
U2 - 10.1158/1078-0432.CCR-25-0976
DO - 10.1158/1078-0432.CCR-25-0976
M3 - Article
C2 - 42001480
AN - SCOPUS:105037886678
SN - 1078-0432
VL - 32
SP - 1729
EP - 1744
JO - Clinical cancer research : an official journal of the American Association for Cancer Research
JF - Clinical cancer research : an official journal of the American Association for Cancer Research
IS - 9
ER -