Bicocca Open Archive

Optimizing Large Language Models for ESG Activity Detection in Financial Texts

Birti M.; Maurino A.; Osborne F.

2025

Abstract

The integration of Environmental, Social, and Governance (ESG) factors into corporate decision-making is a fundamental aspect of sustainable finance. However, ensuring that business practices align with evolving regulatory frameworks remains a persistent challenge. AI-driven solutions for automatically assessing the alignment of sustainability reports and non-financial disclosures with specific ESG activities could greatly support this process. Yet, this task remains complex due to the limitations of general-purpose Large Language Models (LLMs) in domain-specific contexts and the scarcity of structured, high-quality datasets. In this paper, we investigate the ability of current-generation LLMs to identify text related to environmental activities. Furthermore, we demonstrate that their performance can be significantly enhanced through fine-tuning on a combination of original and synthetically generated data. To this end, we introduce ESG-Activities, a benchmark dataset containing 1,325 labelled text segments classified according to the EU ESG taxonomy. Our experimental results show that fine-tuning on ESG-Activities significantly enhances classification accuracy, with open models such as Llama 7B and Gemma 7B outperforming large proprietary solutions in specific configurations. These findings have important implications for financial analysts, policymakers, and AI researchers seeking to enhance ESG transparency and compliance through advanced natural language processing techniques.

Contribution type: paper
Keywords: Deep learning; Environmental management; Financial technology; Generative AI; Large Language Models; Machine learning; Natural language processing; Sustainability; Text classification
Content language: English
Conference name: ICAIF '25: 6th ACM International Conference on AI in Finance - November 15 - 18, 2025
Conference year: 2025
Proceedings title: ICAIF '25: Proceedings of the 6th ACM International Conference on AI in Finance
Proceedings volume ISBN: 9798400722202
Publication date: 2025
First page: 856
Last page: 863
DOI: https://dx.doi.org/10.1145/3768292.3770371
Full text: open
Citation: Birti, M., Maurino, A., Osborne, F. (2025). Optimizing Large Language Models for ESG Activity Detection in Financial Texts. In ICAIF '25: Proceedings of the 6th ACM International Conference on AI in Finance (pp. 856-863). Association for Computing Machinery, Inc [10.1145/3768292.3770371].
Appears in types: 02 - Conference contribution

Files in this item:

File	Access	Attachment type	License	Size	Format
Birti-2025-arXiv-preprint.pdf	Open access	Submitted Version (Pre-print)	Creative Commons	561.26 kB	Adobe PDF
Birti et al-2025-ICAIF-Vor.pdf	Open access	Publisher's Version (Version of Record, VoR)	Creative Commons	554.58 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/584641
