Fersini, E., Rizzi, G., Saibene, A., Gasparini, F. (2022). Misogynous MEME Recognition: A Preliminary Study. In Advances in Artificial Intelligence. AIxIA 2021 (pp. 279-293) [10.1007/978-3-031-08421-8_19].
Misogynous MEME Recognition: A Preliminary Study
Fersini E.; Rizzi G.; Saibene A.; Gasparini F.
2022
Abstract
Misogyny is a form of hate against women that has been spreading exponentially through the Web, especially on social media platforms. Hateful content may be expressed through popular communication tools, such as memes. A meme is an image characterised by pictorial content with an overlaid text added a posteriori, and its original aim is typically to be funny and/or ironic. However, the use of memes to convey misogynous messages has increased, and automatic detection of such content appears necessary to counteract this phenomenon. The task is particularly challenging, given that (1) different memes can share the same image but carry different texts, and vice versa, (2) two memes with the same image but different texts can convey a misogynous and a non-misogynous message, respectively, and (3) misogyny can be expressed by the image alone, the text alone, or their combination. In this paper, both unimodal and multimodal approaches are investigated, with classifiers trained and tested on a dataset of in-the-wild memes annotated with both expert and perceived labels. The proposed multimodal approach provides better results than the unimodal ones and the VisualBERT state-of-the-art benchmark.