Bicocca Open Archive

Campagner, A. (2024). Learning from fuzzy labels: Theoretical issues and algorithmic solutions. INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 171(August 2024) [10.1016/j.ijar.2023.108969].

Learning from fuzzy labels: Theoretical issues and algorithmic solutions

Campagner A.

2024

Abstract

In this article we study the problem of learning from fuzzy labels (LFL), a form of weakly supervised learning in which the supervision target is not precisely specified but is instead given in the form of possibility distributions that express the imprecise knowledge of the annotating agent. While several approaches for LFL have been proposed in the literature, including generalized risk minimization (GRM), instance-based methods and pseudo label-based learning, both their theoretical properties and their empirical performance have scarcely been studied. We address this gap by: first, presenting a review of the previous results on sample complexity and generalization bounds for GRM and instance-based methods; second, studying their computational complexity, by proving in particular the impossibility of efficiently solving LFL using GRM, and establishing impossibility theorems. We then propose a novel pseudo label-based learning method, called Random Resampling-based Learning (RRL), which draws directly on ensemble learning and possibility theory, and study its learning- and complexity-theoretic properties, showing that it achieves guarantees similar to those for GRM while being computationally efficient. Finally, we study the empirical performance of several state-of-the-art LFL algorithms on a wide set of synthetic and real-world benchmark datasets, confirming the effectiveness of the proposed RRL method. Additionally, we describe directions for future research and highlight opportunities for further interaction between machine learning and uncertainty representation theories.

Subtype	Journal article - Scientific article
Keywords	Ensemble learning; Fuzzy labels; Machine learning; Possibility theory; Statistical learning theory; Weakly supervised learning
Content language	English
Ahead-of-print date or date of first online publication	21-Jun-2023
Publication date	2024
Journal	INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
Volume number	171
Issue	August 2024
Article number	108969
Article DOI	https://dx.doi.org/10.1016/j.ijar.2023.108969
Fulltext	open
Appears in types	01 - Journal article

Files in this item:

File	Description	Attachment type	License	Size	Format
Campagner-2023-IJAR-preprint.pdf (open access)	Research Article	Submitted Version (Pre-print)	Other	937.91 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/433378
