Exploring Language Markers of Mental Health in Psychiatric Stories

Marco Spruit*, Stephanie Verkleij, Kees de Schepper, Floortje Scheepers

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



Diagnosing mental disorders is complex because genetic, environmental and psychological factors all contribute, alongside individual risk factors. Language markers for mental disorders can support diagnosis. Research on language markers and their associated mental disorders has so far relied mainly on the Linguistic Inquiry and Word Count (LIWC) program. To improve on this research, we applied a range of Natural Language Processing (NLP) techniques, using LIWC, spaCy, fastText and RobBERT, to analyse Dutch psychiatric interview transcriptions with both rule-based and vector-based approaches. Our primary objective was to predict whether a patient had been diagnosed with a mental disorder and, if so, which specific type. Our secondary objective was to identify which words serve as language markers for which mental disorder. LIWC combined with the random forest classification algorithm performed best at predicting whether a person had a mental disorder (accuracy: 0.952; Cohen’s kappa: 0.889). SpaCy combined with random forest performed best at predicting which particular mental disorder a patient had been diagnosed with (accuracy: 0.429; Cohen’s kappa: 0.304).
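The abstract reports both accuracy and Cohen's kappa. As a rough illustration of how the two metrics differ, the sketch below computes them from scratch for a binary "disorder" versus "none" prediction. The labels are invented toy data, not the study's transcripts or results, and the helper names `accuracy` and `cohen_kappa` are hypothetical; the abstract does not say how the authors computed these scores.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Expected chance agreement from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Only one label occurs in both sequences; kappa is undefined here.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Toy labels (assumption: made up for illustration only).
y_true = ["disorder"] * 6 + ["none"] * 4
y_pred = ["disorder"] * 5 + ["none"] + ["none"] * 3 + ["disorder"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Because kappa subtracts the agreement expected by chance, it is lower than accuracy whenever chance agreement is above zero, which is why the multi-class result above (accuracy 0.429) corresponds to a kappa of only 0.304.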

Original language: English
Article number: 2179
Pages (from-to): 1-17
Journal: Applied Sciences (Switzerland)
Issue number: 4
Publication status: Published - 1 Feb 2022


  • Deep learning
  • FastText
  • Language marker
  • LIME
  • LIWC
  • Mental disorder
  • RobBERT
  • SpaCy

