Distribution-Based Similarity Measures Applied to Laboratory Results Matching. - Université de Lille
Journal article. Studies in Health Technology and Informatics. Year: 2021

Abstract

Using international laboratory terminologies within hospital information systems is a prerequisite for data reuse analyses across inter-hospital databases. While most terminology matching techniques that support semantic interoperability are language-based, an alternative strategy is distribution matching, which matches terms based on the statistical similarity of their data distributions. In this work, our objective is to design and assess a structured framework for distribution matching on concepts described by continuous variables. We propose a framework that combines distribution matching and machine learning techniques. A match probability score is built from a training sample of correct and incorrect correspondences between different terminologies. For each term, the best candidates are returned, sorted in decreasing order of the probability given by the model. When 101 terms from Lille University Hospital were searched among the same list of concepts in MIMIC-III, the model returned the correct match in the top 5 candidates for 96 of them (95%). Using this open-source framework with a top-k suggestion system could make expert validation of terminology alignments easier.
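The pipeline described in the abstract (similarity features for each candidate pair, a model trained on correct and incorrect correspondences, then top-k ranking by match probability) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two features (a two-sample Kolmogorov-Smirnov statistic and a relative median difference), the hand-written logistic regression, the function names, and the synthetic Gaussian "laboratory" values are all hypothetical choices made for the example.

```python
import math
import random
import statistics


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def features(a, b):
    """Similarity features for one candidate pair (assumed, not taken from the paper)."""
    med_a, med_b = statistics.median(a), statistics.median(b)
    rel_median_diff = abs(med_a - med_b) / (abs(med_a) + abs(med_b) + 1e-9)
    return [ks_statistic(a, b), rel_median_diff]


def match_probability(w, x):
    """Logistic score; the last entry of w is the bias term."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Plain batch gradient descent on the logistic loss (stand-in for any classifier)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = match_probability(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def top_k(query_values, candidates, w, k=5):
    """Return the k candidate terms with the highest match probability."""
    scored = [(name, match_probability(w, features(query_values, values)))
              for name, values in candidates.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]


# Synthetic data only: (mean, sd) pairs are made up for illustration.
rng = random.Random(0)
PARAMS = {"sodium": (140.0, 3.0), "potassium": (4.2, 0.5), "glucose": (5.5, 1.5)}


def draw(n=200):
    return {name: [rng.gauss(mu, sd) for _ in range(n)]
            for name, (mu, sd) in PARAMS.items()}


# Training sample: every source/target pair, labelled correct (1) or incorrect (0).
train_src, train_tgt = draw(), draw()
X, y = [], []
for s_name, s_vals in train_src.items():
    for t_name, t_vals in train_tgt.items():
        X.append(features(s_vals, t_vals))
        y.append(1 if s_name == t_name else 0)
weights = fit_logistic(X, y)

# Query a fresh "sodium" sample against a fresh candidate terminology.
ranked = top_k(draw()["sodium"], draw(), weights, k=2)
```

In this sketch `ranked` holds `(term, probability)` tuples in decreasing order of probability, which is the form an expert reviewer would see in a top-k suggestion list. In practice, more features and a stronger classifier would likely be needed to separate analytes with overlapping value ranges.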
Main file
SHTI-287-SHTI210823.pdf (233.63 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-04387914, version 1 (11-01-2024)

Cite

M. Courtois, A. Filiot, Gregoire Ficheur. Distribution-Based Similarity Measures Applied to Laboratory Results Matching. Studies in Health Technology and Informatics, 2021, 287, pp.94-98. ⟨10.3233/SHTI210823⟩. ⟨hal-04387914⟩

Collections

UNIV-LILLE