Bicocca Open Archive

Multiword Expressions We Live by: A Validated Usage-based Dataset from Corpora of Written Italian

Masini, F.; Micheli, M. S.; Zaninello, A.; Castagnoli, S.; Nissim, M.

2020

Abstract

The paper describes the creation of a manually validated dataset of Italian multiword expressions, building on candidates automatically extracted from corpora of written Italian. The main features of the resource, such as POS-pattern and lemma distribution, are also discussed, together with possible applications.

	Contribution type
	
				paper
			
	Keywords
	
				Multiword expressions; Italian; corpora
			
	Content language
	
				English
			
	Conference name
	
				Italian Conference on Computational Linguistics 2020
			
	Conference year
	
				2020
			
	Volume editors
	
				Monti, J.; Dell'Orletta, F.; Tamburini, F.
			
	Proceedings title
	
				Proceedings of the Seventh Italian Conference on Computational Linguistics
			
	Series
	
				CEUR Workshop Proceedings
			
	Publication date
	
				2020
			
	Volume number
	
				2769
			
	Alternative URL
	
				http://ceur-ws.org/Vol-2769/paper_33.pdf
			
	Full text
	
				none
			
	Citation
	
				Masini, F., Micheli, M., Zaninello, A., Castagnoli, S., Nissim, M. (2020). Multiword Expressions We Live by: A Validated Usage-based Dataset from Corpora of Written Italian. In Proceedings of the Seventh Italian Conference on Computational Linguistics.
			
	Appears in types:
	
				02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/300508
