Exploring Embedding Spaces for more Coherent Topic Modeling in Electronic Health Records

Emil Rijcken, Kalliopi Zervanou, Marco Spruit, Pablo Mosteiro, Floortje Scheepers, Uzay Kaymak

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

The written notes in Electronic Health Records contain a vast amount of information about patients. Automated approaches to text classification on these notes need to be interpretable, and topic models can serve this goal because they indicate which topics in a text are relevant to a decision. We propose a new topic modeling algorithm, FLSA-E, and compare it with another state-of-the-art algorithm, FLSA-W. In FLSA-E, topics are found by fuzzy clustering in a word embedding space. Since word embeddings are the basis for our clustering, we extend our evaluation with word-embedding-based evaluation metrics. We find that different evaluation metrics favour different algorithms. The results provide evidence that FLSA-E has fewer outliers in its topics, a desirable property given that within-topic words need to be semantically related.
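The idea of finding topics by fuzzy clustering in a word embedding space can be illustrated with a minimal sketch. It is not the FLSA-E implementation: it uses plain fuzzy c-means written in NumPy, and the vocabulary, the 2-D "embeddings", the function names `fuzzy_c_means` and `top_words`, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (an assumed stand-in, not the paper's exact method).

    X: (n_words, dim) array of word vectors.
    Returns U (n_words, n_clusters) soft memberships and the cluster centers.
    """
    rng = np.random.default_rng(seed)
    # Start from random memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m  # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every word to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


def top_words(U, vocab, k=3):
    """For each cluster ("topic"), list the k words with highest membership."""
    return [[vocab[i] for i in np.argsort(-U[:, c])[:k]]
            for c in range(U.shape[1])]


# Hypothetical toy vocabulary with made-up 2-D vectors; real word
# embeddings would come from a trained model and have many more dimensions.
vocab = ["sleep", "insomnia", "fatigue", "anger", "aggression", "threat"]
X = np.array([[0.9, 0.1], [1.0, 0.0], [0.8, 0.2],
              [0.1, 0.9], [0.0, 1.0], [0.2, 0.8]])

U, centers = fuzzy_c_means(X, n_clusters=2)
topics = top_words(U, vocab, k=3)
print(topics)
```

Because memberships are soft, every word belongs to every topic to some degree; reading off the highest-membership words per cluster is one simple way to turn the clustering into topics.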

Original language: English
Title of host publication: 2022 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2669-2674
Number of pages: 6
ISBN (Electronic): 9781665452588
Publication status: Published - 2022
Event: 2022 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2022 - Prague, Czech Republic
Duration: 9 Oct 2022 to 12 Oct 2022

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Volume: 2022-October
ISSN (Print): 1062-922X

Conference

Conference: 2022 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2022
Country/Territory: Czech Republic
City: Prague
Period: 9/10/22 to 12/10/22

Keywords

  • Electronic Health Records
  • Fuzzy Clustering
  • Fuzzy Methods
  • Natural Language Processing
  • Neural Network methods
  • Psychiatry
  • Topic Modeling
  • Word Embeddings
