Bicocca Open Archive

Evaluation of Incremental Entity Extraction with Background Knowledge and Entity Linking

Pozzi, R; Moiraghi Motta, F; Lodi, F; Palmonari, M

2022

Abstract

Named entity extraction is a crucial task to support the population of Knowledge Bases (KBs) from documents written in natural language. However, in many application domains, these documents must be collected and processed incrementally to update the KB as more data are ingested. In some cases, quality concerns may even require human validation mechanisms along the process. While very recent work in the NLP community has discussed the importance of evaluating and benchmarking continuous entity extraction, it has proposed methods and datasets that avoid Named Entity Linking (NEL) as a component of the extraction process. In this paper, we advocate for batch-based incremental entity extraction methods that can exploit NEL with a background KB, detect mentions of entities that are not yet present in the KB (NIL mentions), and update the KB with the novel entities. Based on this assumption, we present a methodology to evaluate NEL-based incremental entity extraction, which can be applied to turn a “static” dataset for evaluating NEL into a dataset for evaluating incremental entity extraction. We apply this methodology to an existing benchmark for evaluating NEL algorithms, and evaluate an incremental extraction pipeline that orchestrates several strong state-of-the-art and baseline algorithms for the tasks involved in the extraction process, namely, NEL, NIL prediction, and NIL clustering. In presenting our experiments, we demonstrate the increased difficulty of the information extraction task in incremental settings and discuss the strengths of the available solutions as well as open challenges.
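As a rough illustration of the batch-based loop the abstract describes (link mentions against the KB, flag NIL mentions, cluster them, and add the resulting entities before the next batch arrives), the following minimal Python sketch uses toy stand-ins. The `KnowledgeBase` class, the `process_batch` function, the exact-match linker, the `nil_threshold` parameter, and the surface-form clustering are all hypothetical simplifications introduced here; they are not the algorithms or the pipeline evaluated in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class KnowledgeBase:
    # Maps an entity id to its known surface forms (lowercased).
    entities: Dict[str, Set[str]] = field(default_factory=dict)
    next_id: int = 0

    def link(self, mention: str) -> Tuple[Optional[str], float]:
        """Toy NEL (assumption): exact surface-form match scores 1.0, else 0.0."""
        key = mention.lower()
        for entity_id, names in self.entities.items():
            if key in names:
                return entity_id, 1.0
        return None, 0.0

    def add_entity(self, names: Set[str]) -> str:
        """Register a novel entity and return its generated id."""
        entity_id = f"NEW_{self.next_id}"
        self.next_id += 1
        self.entities[entity_id] = {n.lower() for n in names}
        return entity_id


def process_batch(kb: KnowledgeBase, mentions: List[str],
                  nil_threshold: float = 0.5) -> Dict[str, str]:
    """Link one batch, cluster its NIL mentions, and update the KB."""
    links: Dict[str, str] = {}
    nil_clusters: Dict[str, Set[str]] = {}
    for mention in mentions:
        entity_id, score = kb.link(mention)
        if entity_id is not None and score >= nil_threshold:
            links[mention] = entity_id
        else:
            # Toy NIL clustering (assumption): group by normalized surface form.
            nil_clusters.setdefault(mention.lower(), set()).add(mention)
    # Each NIL cluster becomes a new KB entity, visible to later batches.
    for names in nil_clusters.values():
        new_id = kb.add_entity(names)
        for name in names:
            links[name] = new_id
    return links


kb = KnowledgeBase(entities={"Q1": {"milan"}})
batch1 = process_batch(kb, ["Milan", "Bicocca"])
batch2 = process_batch(kb, ["bicocca", "Milan"])
```

In this sketch, "Bicocca" is a NIL mention in the first batch and is added to the KB as a new entity, so the mention "bicocca" in the second batch links to that new entity instead of being treated as NIL again. That carry-over between batches is the behaviour an incremental evaluation would need to measure.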

Contribution type: paper
Keywords: Entity Extraction; Incremental Entity Extraction; Knowledge Base Population; Named Entity Linking
Content language: English
Conference name: The 11th International Joint Conference on Knowledge Graphs (IJCKG’22), 27-28 October 2022
Conference year: 2022
Volume editors: Artale, A; Calvanese, D; Wang, H; Zhang, X
Proceedings title: IJCKG '22: Proceedings of the 11th International Joint Conference on Knowledge Graphs
Proceedings ISBN: 9781450399876
Publication date: 2022
Volume number: 11
First page: 30
Last page: 38
DOI: https://dx.doi.org/10.1145/3579051.3579063
Alternative URL: http://ijckg.org/2022/papers/IJCKG_2022_paper_3501.pdf
Full text: reserved
Citation: Pozzi, R., Moiraghi Motta, F., Lodi, F., Palmonari, M. (2022). Evaluation of Incremental Entity Extraction with Background Knowledge and Entity Linking. In IJCKG '22: Proceedings of the 11th International Joint Conference on Knowledge Graphs (pp. 30-38). New York, NY: Association for Computing Machinery [10.1145/3579051.3579063].
Appears in types: 02 - Conference contribution

Files in this item:

File	Description	Attachment type	License	Access	Size	Format
Pozzi-2022-IJCKG2022-VoR.pdf	Conference contribution	Publisher’s Version (Version of Record, VoR)	All rights reserved	Archive managers only	743.52 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/423118
