A general framework for evaluating and comparing soft clusterings

Campagner A.; Ciucci D.
2023

Abstract

In this article, we propose a general framework for the development of external evaluation measures for soft clustering. Our proposal is based on the interpretation of soft clustering as representing uncertain information about an underlying, unknown hard clustering. We present a general construction, based on optimal transport theory, by which any evaluation measure can be naturally extended to soft clustering. The proposed “transport-based measure” provides an objective, interval-valued comparison index that represents the range of compatibility between two soft clusterings. We study the metric and complexity properties of the proposed approach, as well as its relationship with other existing proposals. We also propose approximation and bounding algorithms that make the approach practical for large datasets. Finally, we illustrate the application of the proposed method through two computational experiments.
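The interval-valued reading described in the abstract can be illustrated with a small brute-force sketch. It is not the transport-based construction of the article. It assumes a simplified set-valued soft clustering (each object carries a set of candidate cluster labels), enumerates every hard clustering compatible with it, and reports the smallest and largest Rand index over all compatible pairs. The function names `rand_index`, `compatible_hard_clusterings`, and `rand_interval` are hypothetical, and the enumeration is exponential in the number of ambiguous objects, so it only suits toy inputs.

```python
from itertools import combinations, product


def rand_index(a, b):
    """Rand index between two hard clusterings given as label sequences.

    Assumes both sequences have the same length, at least 2.
    """
    pairs = list(combinations(range(len(a)), 2))
    # A pair agrees when both clusterings put it together or both split it.
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def compatible_hard_clusterings(soft):
    """Yield every hard clustering that picks one candidate label per object.

    `soft` is a list of sets of candidate labels (an assumed, simplified
    representation of a soft clustering).
    """
    return product(*soft)


def rand_interval(soft_a, soft_b):
    """Return (min, max) of the Rand index over all compatible hard pairs."""
    values = [
        rand_index(a, b)
        for a in compatible_hard_clusterings(soft_a)
        for b in compatible_hard_clusterings(soft_b)
    ]
    return min(values), max(values)


# Toy data: the third object of soft_a may belong to cluster 0 or cluster 1.
soft_a = [{0}, {0}, {0, 1}, {1}]
soft_b = [{0}, {0}, {1}, {1}]
low, high = rand_interval(soft_a, soft_b)
print(low, high)  # 0.5 1.0
```

In this toy case the interval [0.5, 1.0] reflects that one resolution of the ambiguous object reproduces the second clustering exactly, while the other agrees on only half of the object pairs.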
Type: Journal article - Scientific article
Keywords: Clustering analysis; Comparison; Evaluation; Soft clustering; Validation
Language: English
Date: 13-Dec-2022
Year: 2023
Volume: 623
Issue: April 2023
Pages: 70-93
Access: embargoed until 19/12/2024
Campagner, A., Ciucci, D., Denoeux, T. (2023). A general framework for evaluating and comparing soft clusterings. INFORMATION SCIENCES, 623(April 2023), 70-93 [10.1016/j.ins.2022.11.114].
Files in this item:
File: Campagner-2022-Information Scie-VoR.pdf
Embargo until 19/12/2024
Description: Research Article
Attachment type: Author’s Accepted Manuscript, AAM (Post-print)
License: Creative Commons
Size: 1.07 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/401877
Citations
  • Scopus: 10
  • ISI: 10