Bicocca Open Archive

Campagner, A., Ciucci, D., Hullermeier, E. (2021). Rough set-based feature selection for weakly labeled data. INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 136(September 2021), 150-167 [10.1016/j.ijar.2021.06.005].

Rough set-based feature selection for weakly labeled data

Campagner A.; Ciucci D.; Hullermeier E.

2021

Abstract

Supervised learning is an important branch of machine learning (ML), which requires a complete annotation (labeling) of the involved training data. This assumption is relaxed in the settings of weakly supervised learning, where labels are allowed to be imprecise or partial. In this article, we study the setting of superset learning, in which instances are assumed to be labeled with a set of possible annotations containing the correct one. We tackle the problem of learning from such data in the context of rough set theory (RST). More specifically, we consider the problem of RST-based feature reduction as a suitable means for data disambiguation, i.e., for the purpose of figuring out the most plausible precise instantiation of the imprecise training data. To this end, we define appropriate generalizations of decision tables and reducts, using tools from generalized information theory and belief function theory. Moreover, we analyze the computational complexity and theoretical properties of the associated computational problems. Finally, we present results of a series of experiments, in which we analyze the proposed concepts empirically and compare our methods with a state-of-the-art dimensionality reduction algorithm, reporting a statistically significant improvement in predictive accuracy.

Subtype: Journal article - Scientific article
Keywords: Entropy; Evidence Theory; Feature Selection; Rough Sets; Superset Learning
Content language: English
Ahead-of-print or first online publication date: 18 June 2021
Publication date: 2021
Journal: INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
Volume: 136
Issue: September 2021
First page: 150
Last page: 167
Article DOI: https://dx.doi.org/10.1016/j.ijar.2021.06.005
Full text: open
Appears in types: 01 - Journal article

Files in this item:

File	Access	Attachment type	Size	Format
Superset_Learning_and_Rough_Sets___Extended.pdf	open access	Submitted Version (Pre-print)	370.27 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/324845
