Bicocca Open Archive

The automated identification of national implementing measures (NIMs) of European directives by text similarity techniques has shown promising preliminary results. Previous works have proposed and used unsupervised lexical and semantic similarity techniques based on vector space models, latent semantic analysis and topic models. However, these techniques were evaluated only on a small multilingual corpus of directives and NIMs. In this paper, we use word and paragraph embedding models, learned by shallow neural networks from a multilingual legal corpus of European directives and national legislation (from Ireland, Luxembourg and Italy), to develop unsupervised semantic similarity systems that identify transpositions. We evaluate these models and compare their results with the previous unsupervised methods on a multilingual test corpus of 43 directives and their corresponding NIMs. We also develop supervised machine learning models to identify transpositions and compare their performance across different feature sets.
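To make the retrieval idea in the abstract concrete, here is a minimal sketch of a vector space baseline of the kind the abstract attributes to previous work: a directive provision and candidate national provisions are turned into TF-IDF vectors and ranked by cosine similarity. It is not the authors' implementation and does not reproduce their embedding models. The function names, the smoothed IDF formula and the sample texts are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (a dict of term weights) per document."""
    tokenized = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for counts in tokenized for term in counts)
    # Smoothed IDF (an assumption) so terms shared by every document keep a weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in counts.items()} for counts in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(directive_provision, nim_provisions):
    """Rank candidate national provisions by similarity to a directive provision.

    Returns (score, index) pairs sorted from most to least similar.
    """
    vectors = tfidf_vectors([directive_provision] + nim_provisions)
    query, candidates = vectors[0], vectors[1:]
    scores = [(cosine(query, c), i) for i, c in enumerate(candidates)]
    return sorted(scores, reverse=True)


# Hypothetical sample texts, invented for this sketch (not from the test corpus).
directive = (
    "Member States shall ensure that consumers receive clear information "
    "before the contract is concluded."
)
nims = [
    "The trader shall give the consumer clear information before the contract is concluded.",
    "This Act may be cited as the Road Traffic Act.",
    "The Minister may appoint inspectors for the purposes of this Act.",
]
ranking = rank_candidates(directive, nims)
```

In this sketch the first element of `ranking` points at the candidate sharing the most weighted vocabulary with the directive provision. An embedding-based system of the kind the abstract describes would replace the TF-IDF vectors with learned word or paragraph vectors while keeping the same ranking step.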

Nanda, R., Siragusa, G., Di Caro, L., Boella, G., Grossio, L., Gerbaudo, M., et al. (2019). Unsupervised and supervised text similarity systems for automated identification of national implementing measures of European directives. ARTIFICIAL INTELLIGENCE AND LAW, 27(2), 199-225 [10.1007/s10506-018-9236-y].

Unsupervised and supervised text similarity systems for automated identification of national implementing measures of European directives

Nanda R.; Siragusa G.; Di Caro L.; Boella G.; Grossio L.; Gerbaudo M.; Costamagna F.

2019



	Subtype
	
				Journal article - Scientific article
			
	Keywords
	
				Machine learning; Text similarity; Transposition
			
	Content language
	
				English
			
	Publication date
	
				2019
			
	Journal
	
				ARTIFICIAL INTELLIGENCE AND LAW
			
	Volume number
	
				27
			
	Issue
	
				2
			
	First page
	
				199
			
	Last page
	
				225
			
	Article DOI
	
				https://dx.doi.org/10.1007/s10506-018-9236-y
			
	Fulltext
	
				none
			
	Appears in types:
	
				01 - Journal article

Files in this record:

There are no files associated with this record.


Use this identifier to cite or link to this document: https://hdl.handle.net/10281/295844
