TY - JOUR
T1 - Identification and prediction of difficult-to-treat rheumatoid arthritis patients in structured and unstructured routine care data
T2 - results from a hackathon
AU - Messelink, Marianne A.
AU - Roodenrijs, Nadia M.T.
AU - van Es, Bram
AU - Hulsbergen-Veelken, Cornelia A.R.
AU - Jong, Sebastiaan
AU - Overmars, L. Malin
AU - Reteig, Leon C.
AU - Tan, Sander C.
AU - Tauber, Tjebbe
AU - van Laar, Jacob M.
AU - Welsing, Paco M.J.
AU - Haitjema, Saskia
N1 - Funding Information:
MAM, NMTR, CH, and PMJW declare to have no competing interests. BvE, SJ, and TT are stockholders of MedxAI. LMO, LCR, and SCT are freelancers for MedxAI. JMvL reports personal fees from Arxx Tx, Gesyntha, Magenta, Sanofi Genzyme, Leadiant, Boehringer-Ingelheim, and Galapagos; grants and personal fees from Roche; grants from Astra Zeneca, MSD, and Thermo Fisher; all outside the submitted work. SH is supported by a fellowship of Abbott Diagnostics.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/7/8
Y1 - 2021/7/8
N2 - Background: The new concept of difficult-to-treat rheumatoid arthritis (D2T RA) refers to RA patients who remain symptomatic after several lines of treatment, resulting in a high patient and economic burden. During a hackathon, we aimed to identify and predict D2T RA patients in structured and unstructured routine care data. Methods: Routine care data of 1873 RA patients were extracted from the Utrecht Patient Oriented Database. Data from a previous cross-sectional study, in which 152 RA patients were clinically classified as either D2T or non-D2T, served as a validation set. Machine learning techniques, text mining, and feature importance analyses were performed to identify and predict D2T RA patients based on structured and unstructured routine care data. Results: We identified 123 potentially new D2T RA patients by applying the D2T RA definition in structured and unstructured routine care data. Additionally, we developed a D2T RA identification model derived from a feature importance analysis of all available structured data (AUC-ROC 0.88 (95% CI 0.82–0.94)), and we demonstrated the potential of longitudinal hematological data to differentiate D2T from non-D2T RA patients using supervised dimension reduction. Lastly, using data up to the time of starting the first biological treatment, we predicted future development of D2T RA (AUC-ROC 0.73 (95% CI 0.71–0.75)). Conclusions: During this hackathon, we have demonstrated the potential of different techniques for the identification and prediction of D2T RA patients in structured as well as unstructured routine care data. The results are promising and should be optimized and validated in future research.
AB - Background: The new concept of difficult-to-treat rheumatoid arthritis (D2T RA) refers to RA patients who remain symptomatic after several lines of treatment, resulting in a high patient and economic burden. During a hackathon, we aimed to identify and predict D2T RA patients in structured and unstructured routine care data. Methods: Routine care data of 1873 RA patients were extracted from the Utrecht Patient Oriented Database. Data from a previous cross-sectional study, in which 152 RA patients were clinically classified as either D2T or non-D2T, served as a validation set. Machine learning techniques, text mining, and feature importance analyses were performed to identify and predict D2T RA patients based on structured and unstructured routine care data. Results: We identified 123 potentially new D2T RA patients by applying the D2T RA definition in structured and unstructured routine care data. Additionally, we developed a D2T RA identification model derived from a feature importance analysis of all available structured data (AUC-ROC 0.88 (95% CI 0.82–0.94)), and we demonstrated the potential of longitudinal hematological data to differentiate D2T from non-D2T RA patients using supervised dimension reduction. Lastly, using data up to the time of starting the first biological treatment, we predicted future development of D2T RA (AUC-ROC 0.73 (95% CI 0.71–0.75)). Conclusions: During this hackathon, we have demonstrated the potential of different techniques for the identification and prediction of D2T RA patients in structured as well as unstructured routine care data. The results are promising and should be optimized and validated in future research.
KW - Applied data analytics in medicine
KW - Difficult-to-treat rheumatoid arthritis
KW - Machine learning
KW - Routine care data
KW - Humans
KW - Machine Learning
KW - Arthritis, Rheumatoid/diagnosis
KW - Databases, Factual
UR - http://www.scopus.com/inward/record.url?scp=85110432675&partnerID=8YFLogxK
U2 - 10.1186/s13075-021-02560-5
DO - 10.1186/s13075-021-02560-5
M3 - Article
C2 - 34238346
AN - SCOPUS:85110432675
SN - 1478-6354
VL - 23
SP - 1
EP - 10
JO - Arthritis Research and Therapy
JF - Arthritis Research and Therapy
IS - 1
M1 - 184
ER -