A Multidimensional Framework for Data Quality Assessment in Heart Failure: Integrating IEEE 2801-2022 and Fairness Metrics

  • Marina Georgoula
  • Grigorios G. Kotoulas
  • Konstantina Helen Tsarapatsani
  • Dimitrios G. Boucharas
  • Ioannis Kyprakis
  • Dimitrios Manousos
  • Andrej Preveden
  • Lazar Velicki
  • Amy Groenewegen
  • Frans Rutten
  • Borut Flis
  • Matej Pičulin
  • Peter Vračar
  • Zoran Bosnić
  • Maria Tafelmeier
  • Lars S. Maier
  • Fausto Barlocco
  • Iacopo Olivotto
  • Marta Jimenez-Blanco
  • Jose Luis Zamorano
  • Duncan Edwards
  • Prithwish Banerjee
  • Nduka C. Okwose
  • Sarah Charman
  • Djordje G. Jakovljevic
  • Manolis Tsiknakis
  • Dimitrios I. Fotiadis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Heart failure (HF) affects over 64 million people globally and poses complex diagnostic and therapeutic challenges. Reliable clinical research in HF hinges on high-quality data. This study presents a novel data quality assessment (DQA) framework tailored to retrospective HF datasets. It adapts the criteria of IEEE standard 2801-2022, originally written for general medical data, to the clinical and multimodal structure of HF data, and introduces a fairness-aware dimension to assess demographic representativeness. Applied to a real-world dataset of 6,039 patients and over 110,000 records across 11 clinical domains, the framework evaluates six dimensions: Completeness, Accuracy, Consistency, Compliance, Timeliness, and Fairness. Initial completeness was low (48.82%) but improved to 61.04% after cleaning via outlier correction, imputation, and schema normalization. Accuracy and compliance reached 100%, and consistency improved to 99.61%. Fairness, measured via Jensen-Shannon similarity across age, sex, and BMI, remained at 87.35%, indicating that technical cleaning did not resolve the demographic imbalance. This is the first standards-aligned, domain-adapted, and fairness-extended DQA pipeline for HF, producing a robust dataset suitable for machine learning and clinical decision support.
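
Two of the dimensions named in the abstract, Completeness and Fairness, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: completeness is taken as the percentage of non-missing cells, and Jensen-Shannon similarity as 100 × (1 − JSD), with the divergence computed in base 2 between an observed cohort distribution and a reference distribution. The function names, field names, and sample values are hypothetical and are not taken from the paper.

```python
import math


def completeness(records, fields):
    """Percentage of non-missing cells over the given fields (assumed definition)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields if r.get(f) is not None)
    return 100.0 * filled / total


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, bounded in [0, 1]."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    if sum(p) <= 0 or sum(q) <= 0:
        raise ValueError("each distribution needs positive total mass")
    # Normalize counts to probabilities.
    p = [x / sum(p) for x in p]
    q = [x / sum(q) for x in q]
    # Mixture distribution; m[i] > 0 wherever p[i] > 0 or q[i] > 0.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a[i] == 0 contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def js_similarity(p, q):
    """Similarity as a percentage: 100 means identical distributions (assumed definition)."""
    return 100.0 * (1.0 - js_divergence(p, q))


# Hypothetical toy data, for illustration only.
sample_records = [
    {"age": 71, "sex": "F", "bmi": None},
    {"age": None, "sex": "M", "bmi": 27.4},
]
sample_completeness = completeness(sample_records, ["age", "sex", "bmi"])

# Hypothetical sex counts in a cohort versus a balanced reference.
sample_fairness = js_similarity([70, 30], [50, 50])
```

Under these assumptions, a cleaning step such as imputation raises the completeness score by filling cells, but it leaves the cohort's demographic distribution unchanged. That is consistent with the abstract's observation that the fairness score was not resolved by technical cleaning.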

Original language: English
Title of host publication: Proceedings - 2025 IEEE 25th International Conference on Bioinformatics and Bioengineering, BIBE 2025
Publisher: IEEE
Pages: 456-463
Number of pages: 8
ISBN (Electronic): 9798331558994
DOIs
Publication status: Published - 11 Dec 2025
Event: 25th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2025 - Athens, Greece
Duration: 6 Nov 2026 to 8 Nov 2026

Conference

Conference: 25th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2025
Country/Territory: Greece
City: Athens
Period: 6/11/26 to 8/11/26

Keywords

  • Clinical Decision Support
  • Data Cleaning
  • Data Quality Assessment
  • Heart Failure
  • Retrospective Clinical Data
