Cadeddu, A., Chessa, A., De Leo, V., Fenu, G., Motta, E., Osborne, F., et al. (2025). A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals. IEEE ACCESS, 13, 175271-175291 [10.1109/ACCESS.2025.3618017].
A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals
Osborne F.;
2025
Abstract
In 2012, the United Nations introduced 17 Sustainable Development Goals (SDGs) aimed at creating a more sustainable and improved future by 2030. However, tracking progress toward these goals is difficult because of the extensive scale and complexity of the data involved. Text classification models have become vital tools in this area, automating the analysis of vast amounts of text from a variety of sources. Additionally, large language models (LLMs) have recently proven indispensable for many natural language processing tasks, including text classification, thanks to their ability to recognize complex linguistic patterns and semantics. This study analyzes various proprietary and open-source LLMs on a single-label, multi-class text classification task focused on the SDGs. It also evaluates the effectiveness of task adaptation techniques in this domain, namely in-context learning approaches (Zero-Shot and Few-Shot Learning) and Fine-Tuning. The proposed method leverages LLMs to automatically assign relevant SDG labels to input texts, enabling scalable, consistent, and efficient monitoring of SDG-related content across different sources. Through in-context learning and prompt engineering, the study investigates how smaller, more accessible models can achieve high performance with minimal labeled data. Quantitative experiments demonstrate that, on the SDG text classification task, smaller models (such as flan-t5-large) with prompt optimization can achieve macro F1-scores of up to 0.75, closely matching much larger models such as gpt-3.5, which attained a macro F1-score of 0.77. Few-Shot Learning further improved results on challenging classes, reducing the performance gap between open-source and proprietary LLMs. The results reveal that smaller models, when optimized through prompt engineering, can perform on par with larger models such as OpenAI's GPT (Generative Pre-trained Transformer).
These findings suggest that, with proper prompt and task adaptation, open-source LLMs can offer a competitive and more accessible alternative for SDG classification, paving the way for broader and cost-effective adoption of automated SDG monitoring tools.

| File | Size | Format |
|---|---|---|
| Cadeddu et al-2025-IEEE Access-VoR.pdf (open access; Publisher's Version, Version of Record, VoR; Creative Commons license) | 2.7 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


