Bicocca Open Archive

The Effects of Surprisal across Languages: Results from Native and Non-native Reading

De Varda, A.; Marelli, M.

2022

Abstract

It is well known that the surprisal of an upcoming word, as estimated by language models, is a solid predictor of reading times (Smith and Levy, 2013). However, most of the studies that support this view are based on English and a few other Germanic languages, leaving open the question of whether such findings generalize cross-linguistically. Moreover, they tend to consider only the best-performing eye-tracking measure, which might conflate the effects of predictive and integrative processing. Furthermore, it is not clear whether prediction plays a role in non-native language processing in bilingual individuals (Grüter et al., 2014). We approach these problems at large scale, extracting surprisal estimates from mBERT and assessing their psychometric predictive power on the MECO corpus, a cross-linguistic dataset of eye movement behavior in reading (Siegelman et al., 2022; Kuperman et al., 2020). We show that surprisal is a strong predictor of reading times across languages and fixation measurements, and that its effects are weaker in L2 than in L1.
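For readers unfamiliar with the term, surprisal is standardly defined as the negative log probability of a word given its context. The sketch below only illustrates that definition. The function name and the probabilities are made up for illustration, and this record does not specify how the authors derive word probabilities from mBERT.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return the surprisal, -log2(p), in bits for a conditional word probability p."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


# Hypothetical probabilities a language model might assign to a word in context.
# These numbers are illustrative assumptions, not values from the paper.
example_probabilities = {"predictable word": 0.5, "unexpected word": 0.03125}

for label, p in example_probabilities.items():
    print(f"{label}: p = {p}, surprisal = {surprisal_bits(p)} bits")
```

Less probable words receive higher surprisal, which is the quantity the abstract relates to reading times.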

Contribution type: paper
Keywords: surprisal, sentence processing, multilanguage models
Content language: English
Conference name: 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, AACL-IJCNLP 2022 - 20 November 2022 through 23 November 2022
Conference year: 2022
Proceedings title: 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing - Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Proceedings ISBN: 9781959429043
Publication date: 2022
First page: 138
Last page: 141
Full text: none
Citation: De Varda, A., Marelli, M. (2022). The Effects of Surprisal across Languages: Results from Native and Non-native Reading. In 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing - Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022 (pp. 138-141). Association for Computational Linguistics (ACL).
Appears in types: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/467168
