Bicocca Open Archive

Pozzi, R., Rubini, R., Bernasconi, C., Palmonari, M. (2023). Named Entity Recognition and Linking for Entity Extraction from Italian Civil Judgements. In AIxIA 2023 – Advances in Artificial Intelligence: XXIInd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023, Rome, Italy, November 6–9, 2023, Proceedings (pp. 187-201). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-47546-7_13].

Named Entity Recognition and Linking for Entity Extraction from Italian Civil Judgements

Pozzi R.; Rubini R.; Bernasconi C.; Palmonari M.

2023

Abstract

The extraction of named entities from court judgments is useful in several downstream applications, such as document anonymization and semantic search engines. In this paper, we discuss the application of named entity recognition and linking (NEEL) to extract entities from Italian civil court judgments. To develop and evaluate our work, we use a corpus of 146 manually annotated court judgments. We use a pipeline that combines a transformer-based Named Entity Recognition (NER) component, a transformer-based Named Entity Linking (NEL) component, and a NIL prediction component. While the NEL and NIL prediction components are not fine-tuned on domain-specific data, the NER component is fine-tuned on the annotated corpus. In addition, we compare different masked language modeling (MLM) adaptation strategies to optimize the results and investigate their impact. Results obtained on a 30-document test set show satisfactory performance, especially on the NER task, and highlight the challenges that remain in improving NEEL on similar documents. Our code is available on GitHub (https://github.com/rpo19/pozzi_aixia_2023). We are not allowed to publish the sensitive data or the NER models trained on it.

Contribution type	paper
Keywords	Domain Adaptation; Italian Civil Court Judgments; Legal; Named Entity Linking; Named Entity Recognition; NIL Prediction
Content language	English
Conference name	22nd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023 - November 6–9, 2023
Conference year	2023
Volume editors	Basili, R; Lembo, D; Limongelli, C; Orlandini, A
Proceedings title	AIxIA 2023 – Advances in Artificial Intelligence: XXIInd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023, Rome, Italy, November 6–9, 2023, Proceedings
Proceedings volume ISBN	9783031475450
Series	Lecture Notes in Computer Science
Ahead-of-print or first online publication date	2 Nov 2023
Publication date	2023
Volume number	14318 LNCS
First page	187
Last page	201
Contribution DOI	https://dx.doi.org/10.1007/978-3-031-47546-7_13
Fulltext	partially_open
Appears in categories	02 - Conference contribution

Files in this item:

File	Access	Description	Attachment type	License	Size	Format
Pozzi-2023-AIxIA-AAM.pdf	Open access	AAM of the paper	Author’s Accepted Manuscript, AAM (Post-print)	Other	308.16 kB	Adobe PDF
Pozzi-2023-AIxIA-VoR.pdf	Archive managers only	-	Publisher’s Version (Version of Record, VoR)	All rights reserved	312.89 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/462521
