Bicocca Open Archive

Automatic Speech Recognition: Do Emotions Matter?

Catania, F.; Crovari, P.; Spitale, M.; Garzotto, F.

2019

Abstract

By leveraging advances in Natural Language Processing and Cognitive Computing, conversational artificial intelligence (AI) has matured in recent years. It serves people in a broad range of applications in business, government, healthcare, and entertainment, and it is becoming increasingly embedded in people's lives. However, despite recent improvements, we are still far from a robust or general AI comparable to human intelligence, especially when it comes to adaptive intelligence that can cope with non-standard and noisy environments. In this paper, we show that emotion in speech negatively affects automatic speech recognition and the automatic understanding of human input. Emotion should therefore be treated as noise that compromises the understanding of what the user says and, consequently, disrupts the whole interaction with conversational technologies. For this study, Google Cloud Speech-to-Text and IBM Watson Speech-to-Text were used.

Contribution type: paper
Keywords: Automatic Speech Recognition; Conversational Technology; Emotional speech; Word Error Rate
Content language: English
Conference name: 1st IEEE International Conference on Conversational Data and Knowledge Engineering, CDKE 2019, 9-11 December 2019
Conference year: 2019
Proceedings title: 2019 IEEE International Conference on Conversational Data & Knowledge Engineering (CDKE)
Proceedings ISBN: 9781728160887
Publication date: 2019
First page: 9
Last page: 16
Article number: 8949383
DOI: https://dx.doi.org/10.1109/CDKE46621.2019.00009
Full text: none
Citation: Catania, F., Crovari, P., Spitale, M., Garzotto, F. (2019). Automatic Speech Recognition: Do Emotions Matter? In 2019 IEEE International Conference on Conversational Data & Knowledge Engineering (CDKE) (pp. 9-16). IEEE [10.1109/CDKE46621.2019.00009].
Appears in types: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/556595
