Ground truthing from multi-rater labeling with three-way decision and possibility theory

Campagner A.; Ciucci D.; Cabitza F.
2021

Abstract

In recent years, Machine Learning (ML) has attracted wide interest as an aid for decision makers in complex domains, such as medicine. Although domain experts are typically aware of the intrinsic uncertainty surrounding Ground Truth (GT), the issue of GT quality has scarcely been addressed in the ML literature. GT quality is regularly assumed to be adequate, regardless of the number and skills of the raters involved in data annotation. These factors can, however, have a severe negative impact on the reliability of ML models. In this article we study the influence of GT quality, in terms of the number of raters, their expertise, and their level of agreement, on the performance of ML models. We introduce the concept of reduction: a computational procedure that produces a single-target GT from multi-rater labels. We propose three reductions, based on three-way decision, possibility theory, and probability theory. We characterize these reductions from the perspective of learning theory and propose two ML algorithms. We report the results of experiments, on both real-world medical and synthetic datasets, showing that GT quality strongly affects the performance of ML models, and that the proposed algorithms handle this form of uncertainty better than state-of-the-art approaches.
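
To make the idea of a reduction concrete, the following is a minimal illustrative sketch of how the labels given by several raters to one instance might be collapsed into a single target. It is not the authors' implementation, and the exact definitions in the article may differ. The function names, the `threshold` parameter, and the abstention rule are all assumptions introduced here for illustration: the probabilistic variant uses relative label frequencies, the possibilistic variant rescales frequencies so that the most voted label has degree 1, and the three-way variant accepts the majority label only when agreement is high enough and otherwise abstains by returning every observed label.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Set


def probabilistic_reduction(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Relative frequency of each label among the raters (hypothetical helper)."""
    if not labels:
        raise ValueError("at least one rater label is required")
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


def possibilistic_reduction(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Frequencies rescaled so the most voted label has degree 1 (assumed normalization)."""
    if not labels:
        raise ValueError("at least one rater label is required")
    counts = Counter(labels)
    top = max(counts.values())
    return {label: count / top for label, count in counts.items()}


def three_way_reduction(labels: Sequence[Hashable], threshold: float = 0.7) -> Set[Hashable]:
    """Accept the majority label if its frequency reaches `threshold`;
    otherwise abstain by returning all observed labels (assumed rule)."""
    probs = probabilistic_reduction(labels)
    best = max(probs, key=probs.get)
    if probs[best] >= threshold:
        return {best}
    return set(probs)


if __name__ == "__main__":
    # Five hypothetical raters annotating one case.
    ratings = ["benign", "benign", "malignant", "benign", "malignant"]
    print(probabilistic_reduction(ratings))
    print(possibilistic_reduction(ratings))
    print(three_way_reduction(ratings))
```

In this sketch a downstream learner would receive either a distribution over labels or a set of candidate labels instead of a single hard label, which is one way to keep rater disagreement visible to the model.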
Journal article - Scientific article
Machine learning; Multi-rater; Possibility theory; Three-way decision; Uncertainty
English
28-Sep-2020
2021
545
771
790
none
Campagner, A., Ciucci, D., Svensson, C., Figge, M., Cabitza, F. (2021). Ground truthing from multi-rater labeling with three-way decision and possibility theory. Information Sciences, 545, 771-790 [10.1016/j.ins.2020.09.049].

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/289890
Citations
  • Scopus 38
  • Web of Science (ISI) 29