Emotion recognition, and in particular acial emotion recognition (FER), is among the most controversial applications of machine learning, not least because of its ethical implications for human subjects. In this article, we address the controversial conjecture that machines can read emotions from our facial expressions by asking whether this task can be performed reliably. This means, rather than considering the potential harms or scientific soundness of facial emotion recognition systems, focusing on the reliability of the ground truths used to develop emotion recognition systems, assessing how well different human observers agree on the emotions they detect in subjects’ faces. Additionally, we discuss the extent to which sharing context can help observers agree on the emotions they perceive on subjects’ faces. Briefly, we demonstrate that when large and heterogeneous samples of observers are involved, the task of emotion detection from static images crumbles into inconsistency. We thus reveal that any endeavour to understand human behaviour from large sets of labelled patterns is over-ambitious, even if it were technically feasible. We conclude that we cannot speak of actual accuracy for facial emotion recognition systems for any practical purposes.

Cabitza, F., Campagner, A., Mattioli, M. (2022). The unbearable (technical) unreliability of automated facial emotion recognition. BIG DATA & SOCIETY, 9(2) [10.1177/20539517221129549].

The unbearable (technical) unreliability of automated facial emotion recognition

Cabitza, F
;
Campagner, A;
2022

Abstract

Emotion recognition, and in particular acial emotion recognition (FER), is among the most controversial applications of machine learning, not least because of its ethical implications for human subjects. In this article, we address the controversial conjecture that machines can read emotions from our facial expressions by asking whether this task can be performed reliably. This means, rather than considering the potential harms or scientific soundness of facial emotion recognition systems, focusing on the reliability of the ground truths used to develop emotion recognition systems, assessing how well different human observers agree on the emotions they detect in subjects’ faces. Additionally, we discuss the extent to which sharing context can help observers agree on the emotions they perceive on subjects’ faces. Briefly, we demonstrate that when large and heterogeneous samples of observers are involved, the task of emotion detection from static images crumbles into inconsistency. We thus reveal that any endeavour to understand human behaviour from large sets of labelled patterns is over-ambitious, even if it were technically feasible. We conclude that we cannot speak of actual accuracy for facial emotion recognition systems for any practical purposes.
Articolo in rivista - Articolo scientifico
annotation; Face emotion recognition; ground truth; reliability; survey; user study;
English
2-ott-2022
2022
9
2
none
Cabitza, F., Campagner, A., Mattioli, M. (2022). The unbearable (technical) unreliability of automated facial emotion recognition. BIG DATA & SOCIETY, 9(2) [10.1177/20539517221129549].
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/394391
Citazioni
  • Scopus 9
  • ???jsp.display-item.citation.isi??? 6
Social impact