Introducing new measures of inter- and intra-rater agreement to assess the reliability of medical ground truth

Campagner A.; Cabitza F.
2020

Abstract

In this paper, we present and discuss two new measures of inter- and intra-rater agreement to assess the reliability of the raters, and hence of their labeling, in multi-rater settings, which are common in the production of ground truth for machine learning models. Our proposal is more conservative than other existing agreement measures, as it considers a more articulated notion of agreement by chance, based on an empirical estimation of the precision (or reliability) of the individual raters involved. We discuss the measures in light of a realistic annotation task that involved 13 expert radiologists in labeling the MRNet dataset.
Type: Book chapter or essay
Keywords: Ground Truth; Inter-rater agreement; Machine Learning; Reliability
Language: English
Book title: Digital Personalized Health and Medicine
Editors: Louise B. Pape-Haugaard, Christian Lovis, Inge Cort Madsen, Patrick Weber, Per Hostrup Nielsen, Philip Scott
Year: 2020
ISBN: 9781643680828
Volume: 270
Publisher: IOS Press
Pages: 282-286
Citation: Campagner, A., Cabitza, F. (2020). Introducing new measures of inter- and intra-rater agreement to assess the reliability of medical ground truth. In L.B. Pape-Haugaard, C. Lovis, I.C. Madsen, P. Weber, P.H. Nielsen, P. Scott (Eds.), Digital Personalized Health and Medicine (pp. 282-286). IOS Press [10.3233/SHTI200167].
Files in this item:
File: Campagner-2020-DPHM-VoR.pdf
Access: open access
Description: This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 185.53 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/513664