Campagner, A. (2023). Learning from fuzzy labels: Theoretical issues and algorithmic solutions. INTERNATIONAL JOURNAL OF APPROXIMATE REASONING [10.1016/j.ijar.2023.108969].

Learning from fuzzy labels: Theoretical issues and algorithmic solutions

Campagner A.
2023

Abstract

In this article we study the problem of learning from fuzzy labels (LFL), a form of weakly supervised learning in which the supervision target is not precisely specified but is instead given in the form of possibility distributions that express the imprecise knowledge of the annotating agent. While several approaches to LFL have been proposed in the literature, including generalized risk minimization (GRM), instance-based methods and pseudo label-based learning, both their theoretical properties and their empirical performance have scarcely been studied. We address this gap by, first, reviewing previous results on the sample complexity and generalization bounds of GRM and instance-based methods and, second, studying their computational complexity, proving in particular the impossibility of efficiently solving LFL using GRM, and establishing impossibility theorems. We then propose a novel pseudo label-based learning method, called Random Resampling-based Learning (RRL), which draws directly on ensemble learning and possibility theory, and study its learning- and complexity-theoretic properties, showing that it achieves guarantees similar to those of GRM while being computationally efficient. Finally, we study the empirical performance of several state-of-the-art LFL algorithms on a wide set of synthetic and real-world benchmark datasets, confirming the effectiveness of the proposed RRL method. Additionally, we describe directions for future research and highlight opportunities for further interaction between machine learning and uncertainty representation theories.
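As a rough illustration of what a fuzzy label is and how a crisp pseudo label might be drawn from one, consider the minimal Python sketch below. It is not the RRL algorithm from the article, whose details the abstract does not give. It assumes each fuzzy label is a normalized possibility distribution (at least one class has degree 1) stored as a dictionary, and it uses one common sampling scheme: pick a random level alpha, then choose uniformly among the classes whose possibility is at least alpha. The function names `alpha_cut`, `sample_pseudo_label` and `resample_dataset`, and the toy data, are hypothetical.

```python
import random


def alpha_cut(possibility, alpha):
    """Return the classes whose possibility degree is at least alpha."""
    return [label for label, degree in possibility.items() if degree >= alpha]


def sample_pseudo_label(possibility, rng):
    """Draw one crisp label from a normalized possibility distribution.

    Assumed scheme (not taken from the article): pick alpha uniformly
    in (0, 1], then pick a class uniformly from the alpha-cut.
    """
    alpha = 1.0 - rng.random()  # rng.random() is in [0, 1), so alpha is in (0, 1]
    candidates = alpha_cut(possibility, alpha)
    return rng.choice(candidates)  # non-empty because some degree equals 1


def resample_dataset(fuzzy_labels, rng):
    """Turn a list of fuzzy labels into one list of crisp pseudo labels."""
    return [sample_pseudo_label(p, rng) for p in fuzzy_labels]


# Toy fuzzy labels: the annotator is sure about neither instance.
fuzzy_labels = [
    {"cat": 1.0, "dog": 0.4, "fox": 0.0},  # probably a cat, possibly a dog
    {"cat": 0.0, "dog": 1.0, "fox": 1.0},  # dog or fox, equally possible
]

rng = random.Random(0)
# Several resampled copies of the labels; an ensemble approach could train
# one base model per copy and aggregate their predictions.
replicates = [resample_dataset(fuzzy_labels, rng) for _ in range(5)]
```

Each entry of `replicates` is an ordinary crisp labelling, so any standard supervised learner could be trained on it. Classes with possibility 0 are never drawn under this scheme.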
Journal article - Scientific article
Ensemble learning; Fuzzy labels; Machine learning; Possibility theory; Statistical learning theory; Weakly supervised learning;
English
21 Jun 2023
2023
108969
open
Files in this item:
File: Campagner-2023-IJAR-preprint.pdf
Access: open access
Description: Research Article
Attachment type: Submitted Version (Pre-print)
License: Other
Size: 937.91 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/433378
Citations
  • Scopus 2