Pozzi, R., Rubini, R., Bernasconi, C., Palmonari, M. (2023). Named Entity Recognition and Linking for Entity Extraction from Italian Civil Judgements. In AIxIA 2023 – Advances in Artificial Intelligence XXIInd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023, Rome, Italy, November 6–9, 2023, Proceedings (pp. 187–201). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-47546-7_13].

Named Entity Recognition and Linking for Entity Extraction from Italian Civil Judgements

Pozzi R.; Palmonari M.
2023

Abstract

Extracting named entities from court judgments supports several downstream applications, such as document anonymization and semantic search. In this paper, we discuss the application of named entity recognition and linking (NEEL) to extract entities from Italian civil court judgments. To develop and evaluate our work, we use a corpus of 146 manually annotated court judgments. Our pipeline combines a transformer-based Named Entity Recognition (NER) component, a transformer-based Named Entity Linking (NEL) component, and a NIL prediction component. The NER component is fine-tuned on the annotated corpus, whereas the NEL and NIL prediction components are not fine-tuned on domain-specific data. In addition, we compare different masked language modeling (MLM) adaptation strategies and investigate their impact on the results. Results on a 30-document test set show satisfactory performance, especially on the NER task, and highlight the challenges that remain in improving NEEL on similar documents. Our code is available on GitHub (https://github.com/rpo19/pozzi_aixia_2023). We are not allowed to publish the sensitive data or the NER models trained on it.
Type: paper
Keywords: Domain Adaptation; Italian Civil Court Judgments; Legal; Named Entity Linking; Named Entity Recognition; NIL Prediction
Language: English
Conference: 22nd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023, November 6–9, 2023
Year: 2023
Editors: Basili, R; Lembo, D; Limongelli, C; Orlandini, A
Published in: AIxIA 2023 – Advances in Artificial Intelligence XXIInd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2023, Rome, Italy, November 6–9, 2023, Proceedings
ISBN: 9783031475450
Date: 2 November 2023
Volume: 14318 LNCS
Pages: 187–201
Access: open
Files in this record:

Pozzi-2023-AIxIA-AAM.pdf
Access: open access
Description: AAM of the paper
Attachment type: Author’s Accepted Manuscript, AAM (Post-print)
License: Other
Size: 308.16 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/462521