Martino, A., Ghiglietti, A., Ieva, F., Paganoni, A. (2019). A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data. STATISTICAL METHODS & APPLICATIONS, 28(2), 301-322 [10.1007/s10260-018-00446-6].

A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data

Ghiglietti A.;
2019

Abstract

This paper proposes a clustering procedure for samples of multivariate functions in $(L^2(I))^J$, with $J \geq 1$. The method is based on a k-means algorithm in which the distance between curves is measured with a metric that generalizes the Mahalanobis distance to Hilbert spaces, accounting for the correlation and the variability along all the components of the functional data. The procedure is studied in simulation and compared with k-means based on other distances typically adopted for clustering multivariate functional data. In these simulations, the k-means algorithm with the generalized Mahalanobis distance provides the best clustering performance, in terms of both the mean and the standard deviation of the number of misclassified curves. Finally, the method is applied to two case studies, concerning ECG signals and growth curves, which confirm and strengthen the simulation results.
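To make the idea in the abstract concrete, here is a minimal Python sketch of k-means with a Mahalanobis-type distance on discretized curves. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not give the formula for the distance, so the regularized form used here (squared scores on the covariance eigenfunctions divided by `eigenvalue + 1/p`) is an assumption, as are the function names, the parameter `p`, and the choice to estimate the covariance once from the whole sample. Each curve is assumed to be sampled on a common uniform grid, with the J components concatenated into one row; quadrature weights are ignored.

```python
import numpy as np


def mahalanobis_type_sq_dist(X, C, eigvals, eigvecs, p):
    """Squared Mahalanobis-type distances between rows of X and rows of C.

    Assumed form (not taken from the abstract):
        d^2(x, c) = sum_k <x - c, phi_k>^2 / (lambda_k + 1/p)
    where (lambda_k, phi_k) are eigenpairs of the covariance and p > 0
    is a regularization parameter.
    """
    diff = X[:, None, :] - C[None, :, :]   # shape (n, k, m)
    scores = diff @ eigvecs                # projections on the eigenvectors
    return np.sum(scores**2 / (eigvals + 1.0 / p), axis=-1)


def kmeans_mahalanobis(X, k, p=1e3, n_iter=50, seed=0):
    """Lloyd-style k-means using the distance above (illustrative only)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Simplification: one covariance estimate from the whole sample.
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negatives

    # Initialize the centers with k distinct curves drawn at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)

    for _ in range(n_iter):
        d2 = mahalanobis_type_sq_dist(X, centers, eigvals, eigvecs, p)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                          # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:           # keep the old center if empty
                centers[j] = members.mean(axis=0)

    return labels, centers
```

A call such as `labels, centers = kmeans_mahalanobis(X, k=2)` returns one cluster label per curve and the cluster mean curves. In this assumed form, small values of `p` make the distance behave like a rescaled L2 distance, while large values bring it closer to a full Mahalanobis distance.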
Type: Journal article - Scientific article
Keywords: Distances in L2; k-means algorithm; Multivariate functional data
Language: English
Date: 30 Oct 2018
Year: 2019
Volume: 28
Issue: 2
First page: 301
Last page: 322
none

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/391718