Bicocca Open Archive

Massively multilingual models such as mBERT and XLM-R are increasingly valued in Natural Language Processing research and applications, due to their ability to tackle the uneven distribution of resources available for different languages. The models’ ability to process multiple languages relying on a shared set of parameters raises the question of whether the grammatical knowledge they extracted during pre-training can be considered as a data-driven cross-lingual grammar. The present work studies the inner workings of mBERT and XLM-R in order to test the cross-lingual consistency of the individual neural units that respond to a precise syntactic phenomenon, that is, number agreement, in five languages (English, German, French, Hebrew, Russian). We found that there is a significant overlap in the latent dimensions that encode agreement across the languages we considered. This overlap is larger (a) for long- vis-à-vis short-distance agreement and (b) when considering XLM-R as compared to mBERT, and peaks in the intermediate layers of the network. We further show that a small set of syntax-sensitive neurons can capture agreement violations across languages; however, their contribution is not decisive in agreement processing.
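
As a rough illustration of what a "significant overlap" between languages could mean operationally, the sketch below compares sets of agreement-sensitive unit indices pairwise and scores each overlap against chance with a hypergeometric tail probability. It assumes that a set of unit indices has already been selected for each language. The function names, the toy indices, and the choice of test are hypothetical and are not taken from the article, whose actual procedure is not described in this record.

```python
from itertools import combinations
from math import comb


def overlap_p_value(n_units, size_a, size_b, observed):
    """P(overlap >= observed) if two sets of the given sizes were drawn
    at random from n_units units (hypergeometric upper tail)."""
    total = comb(n_units, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(n_units - size_a, size_b - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


def pairwise_overlap(selected, n_units):
    """Return {(lang_a, lang_b): (shared_count, p_value)} for every language pair.

    `selected` maps a language code to the set of unit indices chosen for it.
    Pairs are keyed in alphabetical order of the language codes.
    """
    results = {}
    for a, b in combinations(sorted(selected), 2):
        shared = len(selected[a] & selected[b])
        p = overlap_p_value(n_units, len(selected[a]), len(selected[b]), shared)
        results[(a, b)] = (shared, p)
    return results


# Toy, made-up unit indices for illustration only (not results from the paper).
toy_selected = {
    "en": {1, 2, 3, 4},
    "de": {3, 4, 5, 6},
    "fr": {7, 8, 9, 10},
}
toy_results = pairwise_overlap(toy_selected, n_units=20)
```

A small p-value for a pair means the two languages share more selected units than random selection would be expected to produce. With real models, `n_units` would be the hidden size of the layer under study, and the analysis would be repeated per layer.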

de Varda, A., Marelli, M. (2023). Data-driven Cross-lingual Syntax: An Agreement Study with Massively Multilingual Models. COMPUTATIONAL LINGUISTICS, 49(2), 261-299 [10.1162/coli_a_00472].

Data-driven Cross-lingual Syntax: An Agreement Study with Massively Multilingual Models

de Varda, AG; Marelli, M

2023



Subtype	Journal article - Scientific article
Keywords	syntax, large language models, multilingual models
Content language	English
Publication date	2023
Journal	COMPUTATIONAL LINGUISTICS
Volume	49
Issue	2
First page	261
Last page	299
Article DOI	https://dx.doi.org/10.1162/coli_a_00472
Full text	open
Citation	de Varda, A., Marelli, M. (2023). Data-driven Cross-lingual Syntax: An Agreement Study with Massively Multilingual Models. COMPUTATIONAL LINGUISTICS, 49(2), 261-299 [10.1162/coli_a_00472].
Appears in types	01 - Journal article

Files in this record:

File	Size	Format
10281-467126_VoR.pdf (open access; attachment type: Publisher’s Version (Version of Record, VoR); license: Creative Commons)	7.15 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/467126
