Applying natural language processing to patient messages to identify depression concerns in cancer patients

Marieke M. van Buchem*, Anne A.H. de Hond, Claudio Fanconi, Vaibhavi Shah, Max Schuessler, Ilse M.J. Kant, Ewout W. Steyerberg, Tina Hernandez-Boussard

*Corresponding author for this work

Research output: Contribution to journal (Article, Academic, peer-reviewed)

Abstract

OBJECTIVE: This study aims to explore and develop tools for early identification of depression concerns among cancer patients by leveraging the novel data source of messages sent through a secure patient portal.

MATERIALS AND METHODS: We developed classifiers based on logistic regression (LR), support vector machines (SVMs), and 2 Bidirectional Encoder Representations from Transformers (BERT) models (original and Reddit-pretrained) on 6600 patient messages from a cancer center (2009-2022), annotated by a panel of healthcare professionals. Performance was compared using AUROC scores, and model fairness and explainability were examined. We also examined correlations between model predictions and depression diagnosis and treatment.

RESULTS: BERT and RedditBERT attained AUROC scores of 0.88 and 0.86, respectively, compared to 0.79 for LR and 0.83 for SVM. BERT showed bigger differences in performance across sex, race, and ethnicity than RedditBERT. Patients who sent messages classified as concerning had a higher chance of receiving a depression diagnosis, a prescription for antidepressants, or a referral to the psycho-oncologist. Explanations from BERT and RedditBERT differed, with no clear preference from annotators.

DISCUSSION: We show the potential of BERT and RedditBERT in identifying depression concerns in messages from cancer patients. Performance disparities across demographic groups highlight the need for careful consideration of potential biases. Further research is needed to address biases, evaluate real-world impacts, and ensure responsible integration into clinical settings.

CONCLUSION: This work represents a significant methodological advancement in the early identification of depression concerns among cancer patients. Our work contributes to a route to reduce clinical burden while enhancing overall patient care, leveraging BERT-based models.
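
As an illustration of the evaluation described in the abstract (AUROC comparison and a check for performance differences across demographic groups), the following is a minimal sketch in plain Python. It is not the authors' code: the function names `auroc` and `auroc_by_group`, the group labels "A" and "B", and all labels and scores are hypothetical toy values chosen only to show the calculation.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_by_group(labels, scores, groups):
    """Compute AUROC separately within each subgroup."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auroc(ys, ss) for g, (ys, ss) in buckets.items()}


# Hypothetical toy data: 1 = message annotated as a depression concern.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.65, 0.8, 0.1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auroc(labels, scores)
per_group = auroc_by_group(labels, scores, groups)
# Gap between the best and worst subgroup: one simple disparity summary.
gap = max(per_group.values()) - min(per_group.values())

print(f"overall AUROC: {overall:.4f}")
print(f"per-group AUROC: {per_group}")
print(f"largest subgroup gap: {gap:.2f}")
```

The pairwise definition is quadratic in the number of examples, which is acceptable for a sketch; a rank-based formulation or a library routine would be the usual choice on a dataset of several thousand messages. The subgroup gap shown here is only one possible way to summarize disparities, and the abstract does not state which fairness measure the study used.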

Original language: English
Pages (from-to): 2255-2262
Number of pages: 8
Journal: Journal of the American Medical Informatics Association
Volume: 31
Issue number: 10
Publication status: Published - 1 Oct 2024

Keywords

  • machine learning
  • mental health
  • natural language processing
  • oncology
