TY - JOUR
T1 - Alignment of vaccine codes using an ontology of vaccine descriptions
AU - Becker, Benedikt F. H.
AU - Kors, Jan A.
AU - van Mulligen, Erik M.
AU - Sturkenboom, Miriam C. J. M.
N1 - Funding Information:
This work was supported by the Innovative Medicines Initiative Joint Undertaking in the ADVANCE project under grant number 115557, with financial contribution from the European Union’s Seventh Framework Programme and EFPIA companies’ in-kind contribution.
Publisher Copyright:
© 2022, The Author(s).
PY - 2022/10/18
Y1 - 2022/10/18
N2 - Background: Vaccine information in European electronic health record (EHR) databases is represented using various clinical and database-specific coding systems and drug vocabularies. The lack of harmonization constitutes a challenge in reusing EHR data in collaborative benefit-risk studies about vaccines. Methods: We designed an ontology of the properties that are commonly used in vaccine descriptions, called Ontology of Vaccine Descriptions (VaccO), with a dictionary for the analysis of multilingual vaccine descriptions. We implemented five algorithms for the alignment of vaccine coding systems, i.e., the identification of corresponding codes from different coding systems, based on an analysis of the code descriptors. The algorithms were evaluated by comparing their results with manually created alignments in two reference sets including clinical and database-specific coding systems with multilingual code descriptors. Results: The best-performing algorithm represented code descriptors as logical statements about entities in the VaccO ontology and used an ontology reasoner to infer common properties and identify corresponding vaccine codes. The evaluation demonstrated excellent performance of the approach (F-scores 0.91 and 0.96). Conclusion: The VaccO ontology allows the identification, representation, and comparison of heterogeneous descriptions of vaccines. The automatic alignment of vaccine coding systems can accelerate the readiness of EHR databases in collaborative vaccine studies.
AB - Background: Vaccine information in European electronic health record (EHR) databases is represented using various clinical and database-specific coding systems and drug vocabularies. The lack of harmonization constitutes a challenge in reusing EHR data in collaborative benefit-risk studies about vaccines. Methods: We designed an ontology of the properties that are commonly used in vaccine descriptions, called Ontology of Vaccine Descriptions (VaccO), with a dictionary for the analysis of multilingual vaccine descriptions. We implemented five algorithms for the alignment of vaccine coding systems, i.e., the identification of corresponding codes from different coding systems, based on an analysis of the code descriptors. The algorithms were evaluated by comparing their results with manually created alignments in two reference sets including clinical and database-specific coding systems with multilingual code descriptors. Results: The best-performing algorithm represented code descriptors as logical statements about entities in the VaccO ontology and used an ontology reasoner to infer common properties and identify corresponding vaccine codes. The evaluation demonstrated excellent performance of the approach (F-scores 0.91 and 0.96). Conclusion: The VaccO ontology allows the identification, representation, and comparison of heterogeneous descriptions of vaccines. The automatic alignment of vaccine coding systems can accelerate the readiness of EHR databases in collaborative vaccine studies.
KW - Alignment
KW - Coding systems
KW - Ontology
KW - Vaccines
UR - http://www.scopus.com/inward/record.url?scp=85140171024&partnerID=8YFLogxK
U2 - 10.1186/s13326-022-00278-0
DO - 10.1186/s13326-022-00278-0
M3 - Article
C2 - 36258262
SN - 2041-1480
VL - 13
SP - 1
EP - 12
JO - Journal of Biomedical Semantics
JF - Journal of Biomedical Semantics
IS - 1
M1 - 24
ER -