Bicocca Open Archive

Narducci, F., Palmonari, M., Semeraro, G. (2017). Cross-lingual link discovery with TR-ESA. INFORMATION SCIENCES, 394-395, 68-87 [10.1016/j.ins.2017.02.019].

Cross-lingual link discovery with TR-ESA

Narducci, F.; Palmonari, Matteo Luigi; Semeraro, G.

2017

Abstract

Cross-lingual data linking is the problem of establishing links between resources, such as places, services, or movies, that are described in different languages. It often requires matching very short descriptions, which makes the problem even more challenging. This work presents a method named TRanslation-based Explicit Semantic Analysis (TR-ESA) to represent and match short textual descriptions available in different languages. TR-ESA first translates short descriptions from any given language into a pivot language using a machine translation tool. It then generates a Wikipedia-based representation of the translated text using the Explicit Semantic Analysis technique. The resulting representations are used to match short descriptions in different languages. The method is incorporated in CroSeR (Cross-lingual Service Retrieval), an interactive data linking tool that recommends potential matches to users. We compared the results of an in-vitro evaluation on a gold standard consisting of five datasets in different languages with an in-vivo experiment that involved human experts supported by CroSeR. The in-vivo evaluation confirmed the results of the in-vitro evaluation and the overall effectiveness of the proposed method.
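
The pipeline described in the abstract (translate into a pivot language, build a concept-based representation, match by similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-entry `CONCEPTS` dictionary stands in for Wikipedia, the `TOY_MT` word table stands in for a machine translation tool, and the TF-IDF weighting and cosine similarity are assumptions chosen for illustration. All names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy stand-in for Wikipedia: concept title -> article text.
CONCEPTS = {
    "Marriage": "marriage wedding spouse civil ceremony registration",
    "Waste management": "waste garbage collection recycling disposal",
    "Birth certificate": "birth certificate registration child record",
}

# Hypothetical stand-in for a machine translation tool (Italian -> English pivot).
TOY_MT = {
    "matrimonio": "marriage", "civile": "civil",
    "raccolta": "collection", "rifiuti": "waste",
    "certificato": "certificate", "nascita": "birth",
}


def translate(text):
    """Word-by-word lookup; unknown words are passed through unchanged."""
    return " ".join(TOY_MT.get(tok, tok) for tok in text.lower().split())


def build_index(concepts):
    """Inverted index: term -> {concept: TF-IDF weight} (assumed weighting)."""
    n = len(concepts)
    tf = {c: Counter(text.split()) for c, text in concepts.items()}
    df = Counter(term for counts in tf.values() for term in counts)
    index = {}
    for concept, counts in tf.items():
        for term, freq in counts.items():
            index.setdefault(term, {})[concept] = freq * math.log(1 + n / df[term])
    return index


def esa_vector(text, index):
    """Represent a text as a weighted vector over concepts."""
    vec = Counter()
    for term in text.lower().split():
        for concept, weight in index.get(term, {}).items():
            vec[concept] += weight
    return vec


def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(source, candidates, index):
    """Translate the source, then rank pivot-language candidates by similarity."""
    src = esa_vector(translate(source), index)
    scored = [(cosine(src, esa_vector(c, index)), c) for c in candidates]
    return sorted(scored, reverse=True)
```

Calling `rank("raccolta rifiuti", ["civil marriage", "waste collection", "birth certificate"], build_index(CONCEPTS))` places "waste collection" first, because both descriptions activate only the "Waste management" concept. In this sketch the candidates are assumed to be already in the pivot language; descriptions in other languages would pass through `translate` first.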

Subtype	Journal article - Scientific article
Keywords	Cross-lingual data linking; Cross-lingual matching; Wikipedia
Content language	English
Publication date	2017
Journal	INFORMATION SCIENCES
Volume	394-395
First page	68
Last page	87
Article DOI	https://dx.doi.org/10.1016/j.ins.2017.02.019
Full text	partially_open
Appears in types	01 - Journal article

Files in this item:

File	Description	Attachment type	Access	Size	Format
TR-ESA-Pre-print.pdf	Post-print version for self-archiving purposes	Author’s Accepted Manuscript, AAM (Post-print)	Open access	1.97 MB	Adobe PDF
+J2017-InfSciences-publisherVersion.pdf	(none given)	Publisher’s Version (Version of Record, VoR)	Archive administrators only	1.67 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/155104
