This paper describes SEW-EMBED, our language-independent approach to multilingual and cross-lingual semantic word similarity as part of SemEval-2017 Task 2. We leverage the Wikipedia-based concept representations developed by Raganato et al. (2016) and propose an embedded augmentation of their explicit high-dimensional vectors, obtained by plugging in an arbitrary word (or sense) embedding representation and computing a weighted average in the continuous vector space. We evaluate SEW-EMBED with two different off-the-shelf embedding representations and report their performance across all monolingual and cross-lingual benchmarks available for the task. Despite its simplicity, especially compared with supervised or overly tuned approaches, SEW-EMBED achieves competitive results in the cross-lingual setting (3rd best result in the global ranking of subtask 2, score 0.56).
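The weighted-average step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weighting scheme or the normalization, so the sketch assumes that each explicit vector is a sparse mapping from dimension names to non-negative weights and that the average is normalized by the sum of the weights that have an embedding. The function names (`embed_explicit_vector`, `cosine`), the toy dimension names, and the two-dimensional embeddings are all hypothetical.

```python
import math


def embed_explicit_vector(explicit, embeddings):
    """Project a sparse explicit vector into the embedding space.

    explicit:   dict mapping a dimension name to its weight (assumed format).
    embeddings: dict mapping a dimension name to a dense vector (list of floats).
    Returns the weighted average of the embeddings of the explicit dimensions.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for name, weight in explicit.items():
        vec = embeddings.get(name)
        if vec is None:
            continue  # dimension has no embedding: skip it (an assumption)
        for i, x in enumerate(vec):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no covered dimension: return the zero vector
    return [x / weight_sum for x in total]


def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data, invented for illustration only.
embeddings = {
    "Feline": [1.0, 0.0],
    "Pet": [0.0, 1.0],
    "Canine": [0.8, 0.6],
}
cat_explicit = {"Feline": 3.0, "Pet": 1.0}
dog_explicit = {"Canine": 2.0, "Pet": 2.0}

cat_vec = embed_explicit_vector(cat_explicit, embeddings)
dog_vec = embed_explicit_vector(dog_explicit, embeddings)
similarity = cosine(cat_vec, dog_vec)
```

Because the embedding lookup is a plain dictionary, any off-the-shelf word or sense embedding could be substituted without changing the averaging code, which is the sense in which the abstract calls the embedding representation "arbitrary".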
Delli Bovi, C., Raganato, A. (2017). SEW-EMBED at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 261-266). Association for Computational Linguistics.
SEW-EMBED at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia
Raganato, Alessandro
2017