A multi-source deep learning model for music emotion recognition

Cazzaniga S. (First); Gasparini F. (Second); Saibene A. (Last)
2024

Abstract

Music has been recognized as an effective tool that can be beneficial in several applications aimed at increasing people's well-being. A personalized music recommender system can suggest playlists based on a user's preferences while considering the emotions the music induces. Since emotion recognition is a subjective task, it is important to define a solid and generalizable Music Emotion Recognition (MER) model as a starting point. This model can then be refined and adapted to a user's specific responses, ensuring a proper interaction between the recommender system and its user. In this paper, a MER model relying on a multi-source input, composed of songs belonging to four publicly available datasets, is presented. The proposed model is based on EfficientNetB3, an architecture designed to provide high performance while remaining computationally efficient. Moreover, data splitting, layer modifications, and parameter settings are proposed to reduce model overfitting. Our proposal achieves performance comparable with the state of the art, providing a robust model that can be adapted to a user's emotional responses in the definition of a music recommender system.
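For a concrete picture of the approach the abstract describes, the following is a minimal Python/TensorFlow sketch of an EfficientNetB3-based classifier over Mel-spectrogram inputs. It is an illustration under stated assumptions, not the authors' released code: the 300x300x3 input resolution, the four-class emotion head, the dropout rate, and the ImageNet-pretrained weights are all assumptions made here for the example.

# Minimal sketch of an EfficientNetB3-based MER classifier (assumptions noted below).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_EMOTIONS = 4  # hypothetical label set, e.g., valence-arousal quadrants

def build_mer_model(input_shape=(300, 300, 3)):
    # EfficientNetB3 backbone; Mel-spectrograms are replicated to 3 channels
    # so that the ImageNet-pretrained weights can be reused (an assumption,
    # not necessarily the paper's preprocessing).
    backbone = tf.keras.applications.EfficientNetB3(
        include_top=False, weights="imagenet", input_shape=input_shape)
    x = layers.GlobalAveragePooling2D()(backbone.output)
    x = layers.Dropout(0.4)(x)  # assumed regularization to curb overfitting
    outputs = layers.Dense(NUM_EMOTIONS, activation="softmax")(x)
    return models.Model(backbone.input, outputs)

model = build_mer_model()
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

The global-pooling head and dropout layer stand in for the layer modifications and parameter settings the abstract mentions; the paper itself should be consulted for the actual configuration.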
paper
EfficientNetB3; Mel-spectrograms; Music Emotion Recognition (MER); perceived emotion
English
3rd Workshop on Artificial Intelligence for Human-Machine Interaction 2024 co-located with the 23rd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024) - November 26, 2024
2024
Saibene, A.; Corchs, S.; Fontana, S.; Solé-Casals, J.
Proceedings of the 3rd Workshop on Artificial Intelligence for Human-Machine Interaction 2024 co-located with the 23rd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024)
2024
3903
33
43
https://ceur-ws.org/Vol-3903/
open
Cazzaniga, S., Gasparini, F., Saibene, A. (2024). A multi-source deep learning model for music emotion recognition. In Proceedings of the 3rd Workshop on Artificial Intelligence for Human-Machine Interaction 2024 co-located with the 23rd International Conference of the Italian Association for Artificial Intelligence (AIxIA 2024) (pp. 33-43). CEUR-WS.
Files in this product:
File: Cazzaniga-2024-AIxHMI-VoR.pdf
Open access
Description: This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 1.44 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/547741