Bicocca Open Archive

A computational framework for speech emotion recognition in case of multisource data

Grossi A.; Fratti G.; Gasparini F.

2023

Abstract

Although many studies have addressed Speech Emotion Recognition (SER), only a few consider speakers of different ages or languages. Most SER datasets reported in the literature are collected from young adults or cover a single language, such as English or Chinese. These datasets tend to be homogeneous and dependent on the context in which they were collected: in general they consist of acted utterances or are recorded in situations specifically designed to evoke certain emotions. This paper proposes a framework that exploits complementary information from multisource data to train a general SER model. To merge the different sources, it describes preprocessing steps that normalize the data with respect to the source, the type of recorded speech, and the subjects considered. It also presents a domain adaptation strategy that adapts the general model to a specific language and/or age group; the specific interest here is in SER models for Italian older adults. Preliminary results, obtained by training on several sources and testing on a different language, confirm the validity of the proposal.

Contribution type: paper
Keywords: domain adaptation; speech emotion recognition; multisource; older adults; XGBoost
Content language: English
Conference name: 4th Italian Workshop on Artificial Intelligence for an Ageing Society, AIxAS 2023 - 9 November 2023
Conference year: 2023
Proceedings title: Proceedings of the 4th Italian Workshop on Artificial Intelligence for an Ageing Society co-located with 22nd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2023)
Series: CEUR Workshop Proceedings
Publication date: 2023
Volume number: 3623
First page: 113
Last page: 126
Alternative URL: https://ceur-ws.org/Vol-3623/
Full text: open
Citation: Grossi, A., Fratti, G., Gasparini, F. (2023). A computational framework for speech emotion recognition in case of multisource data. In Proceedings of the 4th Italian Workshop on Artificial Intelligence for an Ageing Society co-located with 22nd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2023) (pp.113-126). CEUR-WS.
Appears in types: 02 - Conference contribution

Files in this item:

File	Size	Format
Grossi-2023-AIxAS-VoR.pdf (open access). Description: Conference contribution - AIxAS 2023 paper 11. Attachment type: Publisher’s Version (Version of Record, VoR). License: Creative Commons	1.09 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/523800
