In 2015, the United Nations set 17 Sustainable Development Goals (SDGs) to build a better future by 2030, but monitoring progress is challenging due to data complexity. Recent Large Language Models (LLMs) have significantly improved Natural Language Processing tasks, including text classification. This study evaluates only open-weight LLMs for single-label, multi-class SDG text classification, comparing Zero-Shot, Few-Shot, and Fine-Tuning approaches. Our goal is to determine whether smaller, resource-efficient models, optimized through prompt engineering, can obtain competitive results on a challenging dataset. Using a benchmark dataset from the Open SDG initiative, our findings show that, with effective prompt engineering, small models can achieve competitive performance.
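To make the Zero-Shot and Few-Shot settings concrete, the following is a minimal sketch of how a single-label SDG classification prompt might be assembled and how a model's answer might be mapped back to one of the 17 goals. The prompt wording, the `SDG <number>` label format, and the helper names `build_prompt` and `parse_label` are all assumptions made for illustration; they are not taken from the paper, and no model call is shown.

```python
import re

# The 17 SDGs are numbered 1..17. The "SDG <n>" label format is an assumption.
SDG_IDS = range(1, 18)


def build_prompt(text, examples=()):
    """Build a zero-shot prompt (no examples) or a few-shot prompt.

    `examples` is a sequence of (text, sdg_id) pairs. The instruction
    wording is hypothetical and not the prompt used in the paper.
    """
    lines = [
        "Classify the text into exactly one of the 17 UN Sustainable "
        "Development Goals.",
        "Answer with a single label of the form 'SDG <number>' (1-17).",
        "",
    ]
    # Few-shot: prepend labelled demonstrations before the query.
    for ex_text, ex_id in examples:
        lines += [f"Text: {ex_text}", f"Label: SDG {ex_id}", ""]
    lines += [f"Text: {text}", "Label:"]
    return "\n".join(lines)


def parse_label(completion):
    """Return the first valid SDG id (1-17) found in a completion, else None."""
    for match in re.finditer(r"SDG\s*(\d{1,2})\b", completion, flags=re.IGNORECASE):
        sdg_id = int(match.group(1))
        if sdg_id in SDG_IDS:
            return sdg_id
    return None
```

Calling `build_prompt(text)` with no examples gives the Zero-Shot variant, while passing a few `(text, sdg_id)` pairs gives the Few-Shot variant. Enforcing a single parsed label matches the single-label, multi-class setting described in the abstract.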
Cadeddu, A., Chessa, A., De Leo, V., Fenu, G., Motta, E., Osborne, F., et al. (2025). Benchmarking Large Language Models for Sustainable Development Goals Classification: Evaluating In-Context Learning and Fine-Tuning Strategies. In Third International Workshop on Semantic Technologies and Deep Learning Models for Scientific, Technical and Legal Data (SemTech4STLD 2025) co-located with Extended Semantic Web Conference 2025 (ESWC 2025). CEUR-WS.
Benchmarking Large Language Models for Sustainable Development Goals Classification: Evaluating In-Context Learning and Fine-Tuning Strategies
Osborne F.
2025
| File | Access | Attachment type | License | Size | Format |
|---|---|---|---|---|---|
| Cadeddu et al-2025-SemTech4STLD-CEUR-VoR.pdf | Open access | Publisher’s Version (Version of Record, VoR) | Creative Commons | 244.38 kB | Adobe PDF |


