MAMITA: Benchmarking Misogyny in Italian Memes

Fersini, E.; Gasparini, F.; Rizzi, G.; Saibene, A.
2025

Abstract

This paper introduces MAMITA, a novel Italian multimodal benchmark dataset developed for the automatic detection of misogynistic content in online media, with a specific focus on memes. The dataset comprises 1,880 memes sourced from popular social platforms (Facebook, Twitter, Instagram, Reddit) and meme-centric websites, selected using misogyny-related keywords covering a wide range of manifestations, including body shaming, stereotyping, objectification, and violence. A key feature of this benchmark is its dual annotation strategy: all memes were independently labeled both by domain experts and by a pool of 232 crowd annotators, resulting in two parallel sets of annotations that reflect differing labeling perspectives. For each meme, labels include a binary classification (misogynistic or not), the type of misogyny, and its intensity. Beyond categorical labels, the dataset incorporates perspectivist metadata, capturing individual annotators’ perceptions of misogyny along with their demographic and socio-cultural background, including age, level of education, and social status. Each meme’s textual content was also automatically transcribed to enable multimodal analysis. This enriched benchmark enables nuanced research on the automatic detection of misogynistic content in online social media and supports investigations into how perceived misogyny varies across annotator profiles, addressing the urgent challenge posed by the spread of hateful content targeting women. Warning: this paper includes examples that may be offensive or harmful.
Type: paper
Keywords: Misogynous Memes, Italian Benchmark, Expert vs Crowd Annotation, Perspectivism
Language: English
Conference: Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025), 2025
Published in: Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025), 2025
URL: https://ceur-ws.org/Vol-4112/43_main_long.pdf
Access: open
Fersini, E., Gasparini, F., Rizzi, G., Saibene, A. (2025). MAMITA: Benchmarking Misogyny in Italian Memes. In Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025). CEUR-WS.
Files in this record:

Fersini-2025-CLiC-it 2025-VoR.pdf

Open access

Description: Article published in the CEUR-WS proceedings
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 1.76 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/582702