Cadeddu, A., Chessa, A., De Leo, V., Fenu, G., Motta, E., Osborne, F., et al. (2025). A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals. IEEE Access, 13, 175271-175291. doi: 10.1109/ACCESS.2025.3618017.

A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals

Osborne, F.
2025

Abstract

In 2015, the United Nations introduced the 17 Sustainable Development Goals (SDGs), aimed at building a better, more sustainable future by 2030. Tracking progress toward these goals is difficult, however, because of the scale and complexity of the data involved. Text classification models have become vital tools in this area, automating the analysis of vast amounts of text from a variety of sources. Large language models (LLMs) have recently proven indispensable for many natural language processing tasks, including text classification, thanks to their ability to recognize complex linguistic patterns and semantics. This study analyzes several proprietary and open-source LLMs on a single-label, multi-class text classification task focused on the SDGs. It also evaluates the effectiveness of task adaptation techniques in this domain, namely the in-context learning approaches Zero-Shot and Few-Shot Learning, as well as Fine-Tuning. The proposed approach leverages LLMs to automatically assign relevant SDG labels to input texts, enabling scalable, consistent, and efficient monitoring of SDG-related content across different sources. Through in-context learning and prompt engineering, the study investigates how smaller, more accessible models can achieve high performance with minimal labeled data. Quantitative experiments show that, on the SDG text classification task, smaller models (such as flan-t5-large) with prompt optimization can achieve macro F1-scores of up to 0.75, closely matching much larger models such as gpt-3.5, which attained a macro F1-score of 0.77. Few-Shot Learning further improved results for challenging classes, narrowing the performance gap between open-source and proprietary LLMs. The results reveal that smaller models, when optimized through prompt engineering, can perform on par with larger models like OpenAI's GPT (Generative Pre-trained Transformer). These findings suggest that, with proper prompt design and task adaptation, open-source LLMs offer a competitive and more accessible alternative for SDG classification, paving the way for broader, cost-effective adoption of automated SDG monitoring tools.
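As a concrete illustration of the in-context learning setup the abstract describes, the sketch below shows how a zero-shot and a few-shot SDG classification prompt can be issued to an instruction-tuned model such as flan-t5-large. This is a minimal sketch assuming the Hugging Face transformers library and the public google/flan-t5-large checkpoint; the prompt wording, label formatting, and demonstration examples are illustrative and not the exact prompts or data used in the paper.

```python
from transformers import pipeline

# flan-t5 is a text-to-text model, so classification is framed as generation.
classifier = pipeline("text2text-generation", model="google/flan-t5-large")

# Illustrative label set: the 17 SDG classes as short label strings.
LABELS = ", ".join(f"SDG {i}" for i in range(1, 18))

def zero_shot(text: str) -> str:
    # Zero-shot: the prompt contains only the task description and the input.
    prompt = (
        f"Classify the following text into exactly one of: {LABELS}.\n"
        f"Text: {text}\nLabel:"
    )
    return classifier(prompt, max_new_tokens=8)[0]["generated_text"].strip()

def few_shot(text: str, demos: list[tuple[str, str]]) -> str:
    # Few-shot: labeled demonstrations precede the query (in-context learning).
    shots = "\n".join(f"Text: {t}\nLabel: {l}" for t, l in demos)
    prompt = (
        f"Classify each text into exactly one of: {LABELS}.\n"
        f"{shots}\nText: {text}\nLabel:"
    )
    return classifier(prompt, max_new_tokens=8)[0]["generated_text"].strip()

print(zero_shot("Expanding access to clean drinking water in rural communities."))
print(few_shot(
    "New solar farms will cut the region's reliance on coal power.",
    [("School meal programs reduce child hunger in poor districts.", "SDG 2")],
))
```

Note that the macro F1-scores quoted in the abstract are the unweighted mean of per-class F1-scores, so the 17 SDG classes count equally; this is why few-shot gains on hard, underrepresented classes can visibly move the headline number.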
Journal article - Scientific article
Keywords: large language models; sustainable development goals; text classification
Language: English
Publication date: 6 October 2025
Year: 2025
Volume: 13
Pages: 175271-175291
Access: open
Files in this record:
File: Cadeddu et al-2025-IEEE Access-VoR.pdf
Access: open access
Description: A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 2.7 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/572281
Citations
  • Scopus: 0
  • Web of Science: 0