Bicocca Open Archive

Neural language models are increasingly valued in computational psycholinguistics, due to their ability to provide conditional probability distributions over the lexicon that are predictive of human processing times. Given the vast array of available models, it is of both theoretical and methodological importance to assess what features of a model influence its psychometric quality. In this work we focus on parameter size, showing that larger Transformer-based language models generate probabilistic estimates that are less predictive of early eye-tracking measurements reflecting lexical access and early semantic integration. However, relatively bigger models show an advantage in capturing late eye-tracking measurements that reflect the full semantic and syntactic integration of a word into the current language context. Our results are supported by eye movement data in ten languages and consider four models, spanning from 564M to 4.5B parameters.

de Varda, A., Marelli, M. (2023). Scaling in Cognitive Modelling: a Multilingual Approach to Human Reading Times. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp.139-149). Association for Computational Linguistics (ACL) [10.18653/v1/2023.acl-short.14].

Scaling in Cognitive Modelling: a Multilingual Approach to Human Reading Times

de Varda, AG;Marelli, M

2023

Abstract

Neural language models are increasingly valued in computational psycholinguistics, due to their ability to provide conditional probability distributions over the lexicon that are predictive of human processing times. Given the vast array of available models, it is of both theoretical and methodological importance to assess what features of a model influence its psychometric quality. In this work we focus on parameter size, showing that larger Transformer-based language models generate probabilistic estimates that are less predictive of early eye-tracking measurements reflecting lexical access and early semantic integration. However, relatively bigger models show an advantage in capturing late eye-tracking measurements that reflect the full semantic and syntactic integration of a word into the current language context. Our results are supported by eye movement data in ten languages and consider four models, spanning from 564M to 4.5B parameters.

Scheda breve

Scheda completa

Scheda completa (DC)

	Tipo di intervento
	
				paper
			
	Parole chiave
	
				cognitive modelling, inverse scaling, sentence reading, eyetracking
			
	Lingua del contenuto
	
				English
			
	Nome del convegno
	
				61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - 9 July 2023 through 14 July 2023
			
	Anno del convegno
	
				2023
			
	Titolo degli atti
	
				Proceedings of the Annual Meeting of the Association for Computational Linguistics
			
	ISBN del volume degli atti
	
				9781959429715
			
	Collana o serie
	
				PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING
			
	Data di pubblicazione
	
				2023
			
	Numero del volume
	
				2
			
	Pagina iniziale
	
				139
			
	Pagina finale
	
				149
			
	DOI dell'intervento
	
				https://dx.doi.org/10.18653/v1/2023.acl-short.14
			
	Fulltext
	
				open
			
	Citazione
	
				de Varda, A., Marelli, M. (2023). Scaling in Cognitive Modelling: a Multilingual Approach to Human Reading Times. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp.139-149). Association for Computational Linguistics (ACL) [10.18653/v1/2023.acl-short.14].
			
	Appare nelle tipologie:
	
				02 - Intervento a convegno

File in questo prodotto:

File	Dimensione	Formato
unpaywall-bitstream--2097782595.pdf accesso aperto Tipologia di allegato: Publisher’s Version (Version of Record, VoR) Licenza: Creative Commons Dimensione 344.33 kB Formato Adobe PDF Visualizza/Apri	344.33 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/467169

Citazioni

27

25

Social impact