Bicocca Open Archive

LearningToAdapt with word embeddings: Domain adaptation of Named Entity Recognition systems

Nozza D.; Manchanda P.; Fersini E.; Palmonari M.; Messina Enza

2021

Abstract

Named Entity Recognition (NER) aims to identify named entities in a given text and classify them into pre-defined domain entity types such as persons, organizations, and locations. Most existing NER systems use generic entity type classification schemas; however, comparing and integrating the (more or less) different entity types across NER systems is a complex problem even for human experts. In this paper, we propose a supervised approach called L2AWE (Learning To Adapt with Word Embeddings), which aims at adapting a NER system trained on a source classification schema to a given target one. In particular, we validate the hypothesis that the embedding representation of named entities can improve the semantic meaning of the feature space used to perform the adaptation from a source to a target domain. The results obtained on benchmark datasets of informal text show that L2AWE not only outperforms several state-of-the-art models but can also tackle errors and uncertainties introduced by NER systems.

	Subtype
	
				Journal article - Scientific article
			
	Keywords
	
				Domain adaptation; Named Entity Recognition; Word embeddings;
			
	Content language
	
				English
			
	Ahead-of-print date or first online publication date
	
				19 Feb 2021
			
	Publication date
	
				2021
			
	Journal
	
				INFORMATION PROCESSING & MANAGEMENT
			
	Volume number
	
				58
			
	Issue
	
				3 (May 2021)
			
	Article number
	
				102537
			
	Article DOI
	
				https://dx.doi.org/10.1016/j.ipm.2021.102537
			
	Full text
	
				open
			
	Citation
	
				Nozza, D., Manchanda, P., Fersini, E., Palmonari, M., Messina, E. (2021). LearningToAdapt with word embeddings: Domain adaptation of Named Entity Recognition systems. INFORMATION PROCESSING & MANAGEMENT, 58(3 (May 2021)) [10.1016/j.ipm.2021.102537].
			
	Appears in types:
	
				01 - Journal article

Files in this item:

File	Size	Format
L2A_with_Word_Embeddings (1).pdf (open access; attachment type: Submitted Version (Pre-print))	930.31 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/305636
