
Ensuring generalizability and clinical utility in mental health care applications: Robust artificial intelligence-based treatment predictions in diverse psychosis populations

  • Fiona Coutts
  • Sergio Mena
  • Esin Ucur
  • W Wolfgang Fleischhacker
  • Rene Kahn
  • Jeffrey Lieberman
  • Alkomiet Hasan
  • Oliver Howes
  • Christoph Correll
  • Nikolaos Koutsouleris
  • Paris Alexandros Lalousis

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

AIM: Artificial Intelligence (AI)-based prediction models of treatment response promise to revolutionize psychiatric care by enabling personalized treatment, but very few have been thoroughly tested in different samples or compared to current clinical standards. Here we present models predicting antipsychotic response and assess their clinical utility in a robust methodological framework.

METHODS: Machine learning models were trained and cross-validated on clinical and sociodemographic data from 594 individuals with established schizophrenia (NCT00014001) and 323 individuals with first episode psychosis (NCT03510325). Models predicted four measures of antipsychotic response at 3 months after baseline. Clinical utility was assessed using decision curve and calibration curve analyses. Model performance was tested in a reduced feature space and across sex, ethnicity, antipsychotic, and symptom change subgroups to investigate model fairness.
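As a rough illustration of what a decision curve analysis computes, the sketch below evaluates the standard net benefit formula at a single risk threshold and compares it with a "treat everyone" strategy. The function names and the toy data are hypothetical and are not taken from the study; the authors' actual implementation is not described in this abstract.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on everyone whose predicted probability
    is at or above `threshold` (standard decision-curve formula)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged by the model and the outcome occurred.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged by the model but the outcome did not occur.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that flags every individual."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): 1 = outcome occurred, 0 = it did not.
outcomes = [1, 1, 0, 0]
probabilities = [0.9, 0.6, 0.7, 0.2]
model_nb = net_benefit(outcomes, probabilities, 0.5)
reference_nb = net_benefit_treat_all(outcomes, 0.5)
```

A decision curve repeats this calculation over a range of thresholds; a model is considered useful at a given threshold when its net benefit exceeds both the "treat everyone" reference and zero (the "treat no one" strategy).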

RESULTS: Models predicting total symptom severity (r = 0.4-0.68) and symptomatic remission (BAC = 62.4%-69%) performed well in both samples and externally validated successfully in the opposing cohort (r = 0.4-0.5, BAC = 63.5%-65.7%). Performance remained significant when the models were reduced to 8-9 key variables (r = 0.53 for total symptom severity, BAC = 65.3% for symptomatic remission). Models predicting symptomatic remission had a net benefit across risk thresholds of 0.5-0.9 and were moderately well-calibrated (ECE = 0.16-0.18). Model performance differed across sex, ethnicity and medication subgroups.
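For readers unfamiliar with the metrics reported above, the following minimal sketch shows how balanced accuracy (BAC) and expected calibration error (ECE) are commonly defined. It assumes binary labels and equal-width probability bins; the binning scheme, the number of bins, and the function names are assumptions for illustration, since the abstract does not state how the authors computed ECE.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 or 0)."""
    pos = [p for y, p in zip(y_true, y_pred) if y == 1]
    neg = [p for y, p in zip(y_true, y_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    sensitivity = sum(1 for p in pos if p == 1) / len(pos)
    specificity = sum(1 for p in neg if p == 0) / len(neg)
    return (sensitivity + specificity) / 2.0


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE with equal-width bins (an assumed binning scheme): the
    sample-weighted mean gap between observed outcome rate and
    mean predicted probability within each bin."""
    n = len(y_true)
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # Empty bins contribute nothing.
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += (len(members) / n) * abs(observed - predicted)
    return ece


# Toy data (hypothetical), not drawn from the study.
bac = balanced_accuracy([1, 0, 1, 0], [1, 0, 0, 0])
ece = expected_calibration_error([1, 0], [0.75, 0.25], n_bins=2)
```

Balanced accuracy is less sensitive to class imbalance than raw accuracy, which matters when remission rates differ between cohorts. An ECE of 0 indicates that predicted probabilities match observed frequencies within every bin.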

CONCLUSIONS: We present a robust framework for training and assessing the clinical utility of prediction models in psychiatry. Our models generalize across different psychosis populations and show promising calibration and net benefit. However, performance disparities across demographic and treatment subgroups highlight the need for more diverse clinical samples to ensure equitable prediction.

Original language: English
Pages (from-to): 64-75
Number of pages: 12
Journal: Psychiatry and Clinical Neurosciences
Volume: 80
Issue number: 1
Early online date: 6 Nov 2025
Publication status: Published - Jan 2026
Externally published: Yes

Keywords

  • AI
  • antipsychotics
  • psychosis
  • translational
  • treatment response
