A clustering algorithm that combines the advantages of fuzzy clustering and robust statistical estimators is presented. It is based on mixtures of Factor Analyzers, combined with trimming and constrained estimation of the scatter matrices in a modified maximum likelihood approach. The algorithm generates a set of membership values that are used both to form a fuzzy partition of the data set and to contribute to the robust estimates of the mixture parameters. Modeling clusters by Gaussian Factor Analysis allows for dimension reduction and for the discovery of local linear structures in the data. Experiments on artificial data show that the new methodology is resistant to different types of contamination. The tuning parameters, such as the trimming level, the fuzzifier parameter, the number of clusters and the value of the scatter matrices constraint, are briefly discussed, together with some heuristic tools for their choice. Finally, a real data set is analyzed to show how intermediate membership values are estimated for observations lying in cluster overlaps, while cluster cores are composed of observations that are assigned to a cluster in a crisp way.
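
The parsimony that the abstract attributes to Gaussian Factor Analysis can be illustrated with a minimal sketch. It assumes the standard factor-analysis covariance structure, Sigma = Lambda Lambda^T + Psi, with a p x d loading matrix Lambda and a diagonal noise matrix Psi. The function names below are hypothetical and are not taken from the paper, and the sketch does not implement the trimming, the fuzzy memberships or the scatter constraints of the proposed algorithm.

```python
def fa_covariance(loadings, noise_variances):
    """Build Sigma = Lambda Lambda^T + Psi for one cluster.

    loadings: p rows of d factor loadings (Lambda, assumed p x d).
    noise_variances: p positive values (the diagonal of Psi).
    Returns Sigma as a list of p rows.
    """
    p = len(loadings)
    d = len(loadings[0])
    if len(noise_variances) != p:
        raise ValueError("need one noise variance per observed variable")
    # Low-rank part: entry (i, j) is the inner product of rows i and j of Lambda.
    sigma = [
        [sum(loadings[i][k] * loadings[j][k] for k in range(d)) for j in range(p)]
        for i in range(p)
    ]
    # Add the diagonal noise term Psi.
    for i in range(p):
        sigma[i][i] += noise_variances[i]
    return sigma


def free_parameters_full(p):
    """Free parameters of an unconstrained p x p covariance matrix."""
    return p * (p + 1) // 2


def free_parameters_fa(p, d):
    """Free parameters of Lambda Lambda^T + Psi with d factors.

    p*d loadings plus p noise variances, minus d*(d-1)/2 for the
    rotational indeterminacy of the loadings.
    """
    return p * d + p - d * (d - 1) // 2


if __name__ == "__main__":
    sigma = fa_covariance([[1.0], [2.0], [0.5]], [0.1, 0.2, 0.3])
    for row in sigma:
        print(row)
    print(free_parameters_full(10), free_parameters_fa(10, 2))
```

With p = 10 variables and d = 2 factors, each cluster covariance needs 29 free parameters instead of 55, which is the kind of saving that motivates the factor-analytic model in higher dimensions.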

García-Escudero, L., Greselin, F., Iscar, A. (2018). Robust fuzzy and parsimonious clustering based on mixtures of Factor Analyzers. International Journal of Approximate Reasoning, 94(March 2018), 60-75 [10.1016/j.ijar.2018.01.001].

Robust fuzzy and parsimonious clustering based on mixtures of Factor Analyzers

Greselin, F.; 2018

Type: Journal article - Scientific article
Keywords: Fuzzy clustering; Robust clustering; Unsupervised learning; Factor analysis; Hard contrast; Dimension reduction; Outlier identification
Language: English
Year: 2018
Volume: 94
Issue: March 2018
First page: 60
Last page: 75

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/186751