Sentiment recognition of Italian elderly through domain adaptation on cross-corpus speech dataset

Gasparini F.; Grossi A.
2022

Abstract

The aim of this work is to define a speech emotion recognition (SER) model able to recognize positive, neutral and negative emotions in natural conversations of Italian elderly people. Several datasets for SER are available in the literature. However, most of them are in English or Chinese and were recorded while actors and actresses pronounced short phrases, so they do not reflect natural conversation. Moreover, only a few utterances across all the databases come from elderly people. Therefore, in this work, a multi-language and multi-age corpus is built by merging an English dataset, which also includes elderly speakers, with an Italian dataset. A general model based on XGBoost, trained on young and adult English actors and actresses, is proposed. Two domain adaptation strategies are then proposed to adapt the model both to elderly people and to Italian speakers. The results suggest that this approach increases classification performance, and they also underline that new datasets should be collected.
Type: paper
Keywords: cross-corpus SER; cross-language SER; Domain adaptation; Sentiment recognition; Speech emotion recognition
Language: English
Conference: 3rd Italian Workshop on Artificial Intelligence for an Ageing Society, AIxAS 2022 - November 28th - December 2nd, 2022
Editors: Palumbo, F; Gasparini, F; Fracasso, F
Book title: Proceedings of the Italian Workshop on Artificial Intelligence for an Ageing Society 2022 co-located with 21st International Conference of the Italian Association for Artificial Intelligence (AIxIA 2022)
Year: 2022
Volume: 3367
Pages: 12-28
URL: https://ceur-ws.org/Vol-3367/
Access: open
Citation: Gasparini, F., Grossi, A. (2022). Sentiment recognition of Italian elderly through domain adaptation on cross-corpus speech dataset. In Proceedings of the Italian Workshop on Artificial Intelligence for an Ageing Society 2022 co-located with 21st International Conference of the Italian Association for Artificial Intelligence (AIxIA 2022) (pp. 12-28). CEUR-WS.
Files in this item:
File: Gasparini-2022-AIxAS-VoR.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 328.75 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/523804
Citations
  • Scopus 2