Benchmarking Large Language Models for Sustainable Development Goals Classification: Evaluating In-Context Learning and Fine-Tuning Strategies

Osborne, F.
2025

Abstract

In 2015, the United Nations adopted 17 Sustainable Development Goals (SDGs) to build a better future by 2030, but monitoring progress is challenging due to data complexity. Recent Large Language Models (LLMs) have significantly improved Natural Language Processing tasks, including text classification. This study evaluates open-weight LLMs for single-label, multi-class SDG text classification, comparing Zero-Shot, Few-Shot, and Fine-Tuning approaches. Our goal is to determine whether smaller, resource-efficient models, optimized through prompt engineering, can obtain competitive results on a challenging dataset. Using a benchmark dataset from the Open SDG initiative, our findings show that, with effective prompt engineering, small models can achieve competitive performance.
paper
Large Language Models; Sustainable Development Goals; Text Classification; United Nations
English
3rd International Workshop on Semantic Technologies and Deep Learning Models for Scientific, Technical and Legal Data, SemTech4STLD 2025 - June 1st, 2025
2025
Dessi, R; Jeenu, J; Dessi, D; Osborne, F; Aras, H
Third International Workshop on Semantic Technologies and Deep Learning Models for Scientific, Technical and Legal Data (SemTech4STLD 2025) co-located with Extended Semantic Web Conference 2025 (ESWC 2025)
2025
3979
https://ceur-ws.org/Vol-3979/
open
Cadeddu, A., Chessa, A., De Leo, V., Fenu, G., Motta, E., Osborne, F., et al. (2025). Benchmarking Large Language Models for Sustainable Development Goals Classification: Evaluating In-Context Learning and Fine-Tuning Strategies. In Third International Workshop on Semantic Technologies and Deep Learning Models for Scientific, Technical and Legal Data (SemTech4STLD 2025) co-located with Extended Semantic Web Conference 2025 (ESWC 2025). CEUR-WS.
Files in this item:

File: Cadeddu et al-2025-SemTech4STLD-CEUR-VoR.pdf

Open access

Description: Benchmarking Large Language Models for Sustainable Development Goals Classification: Evaluating In-Context Learning and Fine-Tuning Strategies
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 244.38 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/567745
Citations
  • Scopus: 0