Bicocca Open Archive

D'Amico, S., De Santo, A., Mezzanzanica, M., Mercorio, F. (2025). Taxonomy Expansion through Collaborative LLM Mapping. In SAC '25: Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing (pp. 1961-1968). Association for Computing Machinery [10.1145/3672608.3707906].

Taxonomy Expansion through Collaborative LLM Mapping

D'Amico, Simone; De Santo, Alessia; Mezzanzanica, Mario; Mercorio, Fabio

2025

Abstract

Hierarchical taxonomies are crucial for organizing concepts across domains like healthcare, finance, and economics. However, maintaining their accuracy requires continuous updates, often demanding expert input. To address this, we propose TAXMAP (TAxonomy eXpansion through Collaborative LLM MAPping), a system that autonomously expands hierarchical taxonomies using contextual word embeddings and three generative models, with a human validation step ensuring relevance. Implemented within an EU initiative to enhance the European Skill taxonomy (ESCO), our framework processed over 40,000 digital terms gathered from the Web, aligning ESCO skills with labor market needs. This effort resulted in 924 proposed terms, 757 of which were validated by experts as correct. By leveraging large language models (LLMs) as encoders, TAXMAP reduces errors, ensures high precision, and minimizes human effort. Compared to a baseline based on ESCO's hierarchy, our system achieved an 81% Positive Predictive Value (PPV) when combining all three models.
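As a quick check on the figures above, Positive Predictive Value is the share of proposed items that turn out to be correct: PPV = TP / (TP + FP). The sketch below assumes that the 924 proposed terms and the 757 expert-validated terms are the counts behind the reported PPV, which the abstract does not state explicitly. The function and variable names are illustrative and do not come from the paper.

```python
def positive_predictive_value(validated: int, proposed: int) -> float:
    """Return PPV = TP / (TP + FP).

    `validated` is the number of proposals judged correct (TP);
    `proposed` is the total number of proposals (TP + FP).
    """
    if proposed <= 0:
        raise ValueError("proposed must be a positive count")
    if not 0 <= validated <= proposed:
        raise ValueError("validated must lie between 0 and proposed")
    return validated / proposed


# Counts taken from the abstract; treating them as the PPV inputs is an assumption.
proposed_terms = 924
validated_terms = 757

ppv = positive_predictive_value(validated_terms, proposed_terms)
print(f"PPV = {ppv:.3f}")
```

Under that assumption the ratio is roughly 0.819, which agrees with the reported 81% if the percentage is truncated rather than rounded.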

Contribution type: paper
Keywords: automated taxonomy enrichment; labour market intelligence; large language models; NLP
Content language: English
Conference name: SAC '25: Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing, 31 March 2025 - 4 April 2025
Conference year: 2025
Proceedings title: SAC '25: Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing
Proceedings volume ISBN: 9798400706295
Publication date: 2025
First page: 1961
Last page: 1968
Contribution DOI: https://dx.doi.org/10.1145/3672608.3707906
Full text: none
Appears in categories: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/555061
