Bicocca Open Archive

This paper is an extended description of SemEval-2014 Task 1, the task on the evaluation of Compositional Distributional Semantics Models on full sentences. Systems participating in the task were presented with pairs of sentences and were evaluated on their ability to predict human judgments on (1) semantic relatedness and (2) entailment. Training and testing data were subsets of the SICK (Sentences Involving Compositional Knowledge) data set. SICK was developed with the aim of providing a proper benchmark to evaluate compositional semantic systems, though task participation was open to systems based on any approach. Taking advantage of the SemEval experience, in this paper we analyze the SICK data set, in order to evaluate the extent to which it meets its design goal and to shed light on the linguistic phenomena that are still challenging for state-of-the-art computational semantic systems. Qualitative and quantitative error analyses show that many systems are quite sensitive to changes in the proportion of sentence pair types, and degrade in the presence of additional lexico-syntactic complexities which do not affect human judgements. More compositional systems seem to perform better when the task proportions are changed, but the effect needs further confirmation.

Bentivogli, L., Bernardi, R., Marelli, M., Menini, S., Baroni, M., Zamparelli, R. (2016). SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment. LANGUAGE RESOURCES AND EVALUATION, 50(1), 95-124 [10.1007/s10579-015-9332-5].

SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment

Bentivogli, L.; Bernardi, R.; Marelli, M.; Menini, S.; Baroni, M.; Zamparelli, R.

2016

Subtype: Journal article - Scientific article
Keywords: Compositionality; Computational semantics; Distributional semantics models; Linguistics and Language
Content language: English
Publication date: 2016
Journal: LANGUAGE RESOURCES AND EVALUATION
Volume: 50
Issue: 1
First page: 95
Last page: 124
Article DOI: https://dx.doi.org/10.1007/s10579-015-9332-5
Full text: none
Citation: Bentivogli, L., Bernardi, R., Marelli, M., Menini, S., Baroni, M., Zamparelli, R. (2016). SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment. LANGUAGE RESOURCES AND EVALUATION, 50(1), 95-124 [10.1007/s10579-015-9332-5].
Appears in types: 01 - Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/141682
