Corpus-based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are currently the best frequency-based estimators of lexical decision reaction times (up to 3.6% increase in explained variance). The results are robust (observed for Twitter- and Facebook-based frequencies on American English and British English datasets) and are still substantial when we control for corpus size.

Herdağdelen, A., Marelli, M. (2017). Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition. COGNITIVE SCIENCE, 41(4), 976-995 [10.1111/cogs.12392].

Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition

MARELLI, MARCO
Last author
2017

Journal article - Scientific article
Frequency effects; Lexical decision; Social media; Text corpora; Language and Linguistics; Experimental and Cognitive Psychology; Cognitive Neuroscience; Artificial Intelligence
Language: English
Year: 2017
Volume: 41
Issue: 4
Pages: 976-995

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/141662
Citations
  • Scopus 41
  • Web of Science 37