Partially exchangeable datasets are characterized by observations grouped into known, heterogeneous units. The recently developed Common Atoms Model (CAM) is a Bayesian nonparametric technique suited for analyzing this type of data. CAM induces a two-layered clustering structure: one across observations and another across units. In particular, the units are clustered according to their distributional similarities. In this article, we illustrate the versatility of CAM with an application to an openly available Spotify dataset. The dataset contains quantitative audio features for a large number of songs grouped by artists. After describing the data preprocessing steps, we employ CAM to group the Spotify artists according to the distributions of the energy of their songs.
Denti, F., Camerlenghi, F., Guindani, M., Mira, A. (2022). Clustering artists based on the energy distributions of their songs on Spotify via the Common Atoms Model [Clustering di artisti in base alla distribuzione dell’energia delle loro canzoni su Spotify con il Common Atom Model]. In Book of Short Papers SIS 2022 (pp. 121-126).
Clustering artists based on the energy distributions of their songs on Spotify via the Common Atoms Model (Italian title: Clustering di artisti in base alla distribuzione dell’energia delle loro canzoni su Spotify con il Common Atom Model)
Federico Camerlenghi
2022