In the present study, we provide a comprehensive analysis and a multi-dimensional dataset of semantic transparency measures for 1810 German compound words. Compound words are considered semantically transparent when the contribution of the constituents’ meaning to the compound meaning is clear (as in airport), but the degree of semantic transparency varies between compounds (compare strawberry or sandman). Our dataset includes both compositional and relatedness-based semantic transparency measures, also differentiated by constituents. The measures are obtained from a computational and fully implemented semantic model based on distributional semantics. We validate the measures using data from four behavioral experiments: Explicit transparency ratings, two different lexical decision tasks using different nonwords, and an eye-tracking study. We demonstrate that different semantic effects emerge in different behavioral tasks, which can only be captured using a multi-dimensional approach to semantic transparency. We further provide the semantic transparency measures derived from the model for a dataset of 40,475 additional German compounds, as well as for 2061 novel German compounds.
Gunther, F., Marelli, M., Bolte, J. (2020). Semantic transparency effects in German compounds: A large dataset and multiple-task investigation. Behavior Research Methods, 52(3), 1208-1224 [10.3758/s13428-019-01311-4].
Semantic transparency effects in German compounds: A large dataset and multiple-task investigation
Marelli M.
2020