Bicocca Open Archive


Automatic identification and disambiguation of concepts and named entities in the multilingual Wikipedia

Scozzafava F; Raganato A; Moro A; Navigli R

2015

Abstract

In this paper we present an automatic multilingual annotation of the Wikipedia dumps in two languages, with both word senses (i.e. concepts) and named entities. We use Babelfy 1.0, a state-of-the-art multilingual Word Sense Disambiguation and Entity Linking system. As its reference inventory, Babelfy draws upon BabelNet 3.0, a very large multilingual encyclopedic dictionary and semantic network which connects concepts and named entities in 271 languages from different inventories, such as WordNet, Open Multilingual WordNet, Wikipedia, OmegaWiki, Wiktionary and Wikidata. In addition, we perform both an automatic evaluation of the dataset and a language-specific statistical analysis. In detail, we investigate the word sense distributions by part-of-speech and language, together with the similarity of the annotated entities and concepts for a random sample of interlinked Wikipedia pages in different languages. The annotated corpora are available at http://lcl.uniroma1.it/babelfied-wikipedia/.

	Contribution type
				paper
	Keywords
				Corpus annotation; Disambiguation; Entity linking; Multilinguality; Named entities; Semantic annotation; Sense annotation; Word sense disambiguation; Word senses
	Content language
				English
	Conference name
				14th International Conference of the Italian Association for Artificial Intelligence (AIXIA), 23-25 September 2015
	Conference year
				2015
	Volume editors
				Riguzzi, F; Gavanelli, M; Lamma, E
	Proceedings title
				AI*IA 2015 Advances in Artificial Intelligence
	ISBN of the proceedings volume
				978-3-319-24308-5
	Series
				LECTURE NOTES IN COMPUTER SCIENCE
	Publication date
				2015
	Volume number
				9336
	First page
				357
	Last page
				366
	DOI of the contribution
				https://dx.doi.org/10.1007/978-3-319-24309-2_27
	Full text
				reserved
	Citation
				Scozzafava, F., Raganato, A., Moro, A., Navigli, R. (2015). Automatic identification and disambiguation of concepts and named entities in the multilingual Wikipedia. In AI*IA 2015 Advances in Artificial Intelligence (pp. 357-366). Springer Verlag [10.1007/978-3-319-24309-2_27].
	Appears in types:
				02 - Conference contribution

Files in this item:

File	Size	Format
Scozzafava_Automatic_2015.pdf (archive administrators only)	479.77 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/361543
