Hierarchical taxonomies are crucial for organizing concepts across domains like healthcare, finance, and economics. However, maintaining their accuracy requires continuous updates, often demanding expert input. To address this, we propose TAXMAP (TAxonomy eXpansion through Collaborative LLM MAPping), a system that autonomously expands hierarchical taxonomies using contextual word embeddings and three generative models, with a human validation step ensuring relevance. Implemented within an EU initiative to enhance the European Skill taxonomy (ESCO), our framework processed over 40,000 digital terms gathered from the Web, aligning ESCO skills with labor market needs. This effort resulted in 924 proposed terms, 757 of which were validated by experts as correct. By leveraging large language models (LLMs) as encoders, TAXMAP reduces errors, ensures high precision, and minimizes human effort. Compared to a baseline based on ESCO's hierarchy, our system achieved an 81% Positive Predictive Value (PPV) when combining all three models.
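As a rough illustration of the metric named in the abstract, the sketch below computes a Positive Predictive Value from the two counts reported there (924 proposed terms, 757 validated). It assumes that expert-validated terms count as true positives and the remaining proposals as false positives. The function name is hypothetical, and the abstract does not state that the reported 81% figure was derived in exactly this way.

```python
def positive_predictive_value(validated: int, proposed: int) -> float:
    """Return PPV = TP / (TP + FP).

    Assumption: `validated` plays the role of true positives (TP) and
    `proposed` is the total number of proposals (TP + FP).
    """
    if proposed <= 0:
        raise ValueError("proposed must be a positive count")
    if not 0 <= validated <= proposed:
        raise ValueError("validated must lie between 0 and proposed")
    return validated / proposed


# Counts taken from the abstract.
proposed_terms = 924
validated_terms = 757

ppv = positive_predictive_value(validated_terms, proposed_terms)
print(f"PPV = {ppv:.3f}")  # prints "PPV = 0.819"
```

Under these assumptions the ratio comes to roughly 0.82, which is in line with the PPV the abstract reports for the combination of all three models.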

D'Amico, S., De Santo, A., Mezzanzanica, M., Mercorio, F. (2025). Taxonomy Expansion through Collaborative LLM Mapping. In SAC '25: Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing (pp. 1961-1968) [10.1145/3672608.3707906].

Taxonomy Expansion through Collaborative LLM Mapping

D'Amico, Simone; De Santo, Alessia; Mezzanzanica, Mario; Mercorio, Fabio
2025

Type: paper
Keywords: LLM; machine learning; taxonomy enrichment
Language: English
Conference: SAC '25: Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing, 31 March 2025 - 4 April 2025
ISBN: 9798400706295
Pages: 1961-1968

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/555061