

Cross-Domain and Cross-Language Irony Detection: The Impact of Bias on Models’ Generalization

Fersini E.
2023

Abstract

Irony is a complex linguistic phenomenon that has been extensively studied in computational linguistics across many languages. Existing research has relied heavily on annotated corpora, which are inherently biased due to their creation process. This study focuses on the problem of bias in cross-domain and cross-language irony detection and aims to identify the extent of topic bias in benchmark corpora and how it affects the generalization of models across domains and languages (English, Spanish, and Italian). Our findings offer a first insight into this issue and show that mitigating topic bias in these corpora improves the generalization of models beyond their training data. These results have important implications for the development of robust models for the analysis of ironic language.
Type: paper
Keywords: Bias; Bias Mitigation; Cross-Domain; Cross-Language; Irony
Language: English
Conference: 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023
Conference year: 2023
Editors: Métais, E.; Meziane, F.; Sugumaran, V.; Manning, W.; Reiff-Marganiec, S.
Book title: Natural Language Processing and Information Systems 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023, Derby, UK, June 21–23, 2023, Proceedings
ISBN: 9783031353192
Publication year: 2023
Volume: 13913 LNCS
Pages: 140–155
Citation: Ortega-Bueno, R., Rosso, P., & Fersini, E. (2023). Cross-Domain and Cross-Language Irony Detection: The Impact of Bias on Models' Generalization. In Natural Language Processing and Information Systems 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023, Derby, UK, June 21–23, 2023, Proceedings (pp. 140–155). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-35320-8_10

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/451401
Citations
  • Scopus: 1
  • Web of Science: 2