Bicocca Open Archive


Fuzzy clustering of multivariate skew data

Garcìa-Escudero, LA; Greselin, F; Mayo-Iscar, A

2018

Abstract

With the increasing availability of multivariate datasets, asymmetric structures in the data call for more realistic assumptions than the highly useful Gaussian paradigm. Moreover, in maximum likelihood (ML) estimation, a few outliers in the data can distort the estimates and thus lead to unreliable inference. These issues call for more flexible and robust tools for modeling heterogeneous skew data. Our fuzzy clustering method is based on mixtures of skew Gaussian components, combined with impartial trimming and constrained estimation of the scatter matrices in a modified maximum likelihood approach. The algorithm generates a set of membership values, which are used to fuzzy-partition the data set and which contribute to the robust estimates of the mixture parameters. The new methodology is shown to be resistant to different types of contamination by applying it to artificial data. The tuning parameters are briefly discussed, together with some heuristic tools for choosing them. Finally, synthetic and real datasets are analyzed to show how intermediate membership values are estimated for observations lying in cluster overlaps, while cluster cores are composed of observations that are assigned to a single cluster in a crisp way.
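As a rough illustration of how trimming and membership values can interact, the following minimal Python sketch computes memberships for a two-component mixture after discarding the least plausible observations. It is not the authors' algorithm, and it makes several simplifying assumptions: univariate Gaussian components stand in for the skew Gaussian ones, the mixture parameters are taken as given instead of being estimated, the constraint on the scatter matrices is omitted, and memberships are computed as normalized weighted densities. The names `gaussian_pdf`, `trimmed_memberships`, and `alpha` are hypothetical and chosen only for this example.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Univariate Gaussian density (assumption: stands in for a skew Gaussian)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def trimmed_memberships(data, weights, means, sds, alpha):
    """Return one membership row per observation; trimmed rows are all zeros.

    alpha is the trimming proportion: the floor(n * alpha) observations with
    the lowest mixture density are discarded before memberships are computed.
    """
    n = len(data)
    # Weighted component densities for every observation.
    dens = [
        [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
        for x in data
    ]
    totals = [sum(row) for row in dens]

    # Impartial trimming: the fitted model, not the analyst, decides
    # which observations are dropped.
    n_trim = int(math.floor(n * alpha))
    order = sorted(range(n), key=lambda i: totals[i])
    trimmed = set(order[:n_trim])

    memberships = []
    for i in range(n):
        if i in trimmed or totals[i] == 0.0:
            # Trimmed (or numerically impossible) points belong to no cluster.
            memberships.append([0.0] * len(weights))
        else:
            # Normalized weighted densities: each row sums to one.
            memberships.append([d / totals[i] for d in dens[i]])
    return memberships


# Two clusters centered at 0 and 4, one point midway, and one gross outlier.
data = [-0.1, 0.0, 0.2, 2.0, 3.9, 4.0, 4.1, 50.0]
u = trimmed_memberships(
    data, weights=[0.5, 0.5], means=[0.0, 4.0], sds=[1.0, 1.0], alpha=0.125
)
```

In this toy setting the outlier at 50.0 is the single trimmed observation, the point at 2.0 sits midway between the two centers and receives equal membership in both clusters, and points near a center receive an almost crisp assignment.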


Subtype: Chapter or essay
Keywords: fuzzy clustering, skew data, robust statistics
Content language: English
Volume title: COMPSTAT 2018 Book of Abstracts
Volume editors: Colubi, A; Gatu, C
Publication date: 2018
Volume ISBN: 978-9963-2227-3-5
Publisher: COMPSTAT and CRoNoS
First page: 33
Last page: 33
Citation: Garcìa-Escudero, L., Greselin, F., Mayo-Iscar, A. (2018). Fuzzy clustering of multivariate skew data. In A. Colubi, C. Gatu (Eds.), COMPSTAT 2018 Book of Abstracts (p. 33). COMPSTAT and CRoNoS.
Full text: none
Appears in types: 03 - Book contribution



Use this identifier to cite or link to this document: https://hdl.handle.net/10281/218272
