According to psycholinguistic theories, processing a compound word (“snowman”) involves automatically decomposing it into its constituents (“snow”, “man”), which are then connected by an implicit semantic relation (“made of”) to yield a plausible interpretation (“man made of snow”). However, the appropriate relation is often ambiguous and must be selected from a set of competitors. In this study, we investigated whether contextualized word embeddings (CWEs) capture human intuitions about compound interpretations. We used BERT-base to obtain CWEs of compounds in context (e.g., “We built a [snowman] in our garden”). We then systematically replaced each compound with paraphrase variants in which candidate relations were made explicit (e.g., “We built a [man made of snow] in our garden”) and computed the similarity between the CWE of the original compound and the CWE of each variant. These similarities predicted participants’ interpretations (i.e., the probability of selecting a given relation) and their degree of conflict. Thus, we show that CWEs can be leveraged to generate semantic representations for linguistic units that are not directly observable in text but that influence the interpretation and processing of compounds.
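The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes that the embedding of a bracketed span is the mean of its token vectors and that similarity is measured with cosine; the abstract specifies neither choice. The function names, the competing relation “for”, and the toy 3-dimensional vectors are hypothetical stand-ins for real BERT-base outputs.

```python
import math


def mean_pool(token_vectors):
    """Average equal-length token vectors into one span vector (assumed pooling)."""
    n = len(token_vectors)
    return [sum(dim) / n for dim in zip(*token_vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_relations(compound_vec, variant_vecs):
    """Return (relation, similarity) pairs, most similar variant first."""
    scores = {rel: cosine(compound_vec, vec) for rel, vec in variant_vecs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy vectors only: two subword tokens standing in for "[snowman]" in context.
snowman = mean_pool([[0.9, 0.1, 0.3], [0.7, 0.3, 0.1]])

# One pooled vector per paraphrase variant, keyed by the relation made explicit.
variants = {
    "made of": [0.8, 0.2, 0.25],  # "[man made of snow]"
    "for": [0.1, 0.9, 0.4],       # hypothetical competitor, "[man for snow]"
}

ranking = rank_relations(snowman, variants)
for relation, similarity in ranking:
    print(f"{relation}: {similarity:.3f}")
```

In a real pipeline the toy vectors would be replaced by hidden states extracted from the model for the bracketed span in each sentence; how the resulting similarities are related to participants’ choices and to their degree of conflict is not detailed in the abstract and is left out of this sketch.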
Ciapparelli, M., Marelli, M. (2023). Modeling compound word relational interpretations with contextualized word embeddings. Paper presented at: Psycholinguistics in Flanders Conference, Ghent, Belgium.
Modeling compound word relational interpretations with contextualized word embeddings
Ciapparelli, M. (first author); Marelli, M. (last author)
2023