Ingrassia, S., Punzo, A., Vittadini, G., Minotti, S. (2015). The Generalized Linear Mixed Cluster-Weighted Model. Journal of Classification, 32(1), 85-113 [10.1007/s00357-015-9175-1].
The Generalized Linear Mixed Cluster-Weighted Model
VITTADINI, GIORGIO;MINOTTI, SIMONA CATERINA
2015
Abstract
Cluster-weighted models (CWMs) are a flexible family of mixture models for fitting the joint distribution of a random vector composed of a response variable and a set of covariates. CWMs act as a convex combination of the products of the marginal distribution of the covariates and the conditional distribution of the response given the covariates. In this paper, we introduce a broad family of CWMs in which the component conditional distributions are assumed to belong to the exponential family and the covariates are allowed to be of mixed type. Under the assumption of Gaussian covariates, sufficient conditions for model identifiability are provided. Moreover, maximum likelihood parameter estimates are derived using the EM algorithm. Parameter recovery, classification assessment, and performance of some information criteria are investigated through a broad simulation design. An application to real data is finally presented, with the proposed model outperforming other well-established mixture-based approaches.

File | Description | Size | Format
---|---|---|---
82286.pdf | Publisher’s Version (Version of Record, VoR); access restricted to archive managers | 469.18 kB | Adobe PDF
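
The mixture structure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single Gaussian covariate and a Poisson response with a log link (one member of the exponential family), and every name in it (`cwm_density`, `posterior_probabilities`, the keys `weight`, `mu`, `sd`, `b0`, `b1`) is hypothetical. Each component contributes its mixing weight times the conditional density of the response given the covariate times the marginal density of the covariate, and the joint density is the sum of these terms. Normalizing the terms gives the posterior component probabilities that an EM algorithm would compute in its E-step.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a univariate Gaussian covariate."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def poisson_pmf(y, rate):
    """Poisson probability of count y, evaluated on the log scale."""
    return math.exp(-rate + y * math.log(rate) - math.lgamma(y + 1))


def component_terms(x, y, components):
    """weight * p(y | x) * p(x) for each mixture component."""
    terms = []
    for c in components:
        # Assumed log link: the Poisson mean depends on x through exp(b0 + b1 * x).
        rate = math.exp(c["b0"] + c["b1"] * x)
        terms.append(
            c["weight"] * poisson_pmf(y, rate) * gaussian_pdf(x, c["mu"], c["sd"])
        )
    return terms


def cwm_density(x, y, components):
    """Joint density p(x, y): a convex combination of the component products."""
    return sum(component_terms(x, y, components))


def posterior_probabilities(x, y, components):
    """Probability that (x, y) belongs to each component (E-step quantity)."""
    terms = component_terms(x, y, components)
    total = sum(terms)
    return [t / total for t in terms]


# Illustrative parameter values only; they are not taken from the paper.
components = [
    {"weight": 0.4, "mu": -1.0, "sd": 0.8, "b0": 0.2, "b1": 0.5},
    {"weight": 0.6, "mu": 2.0, "sd": 1.2, "b0": 1.0, "b1": -0.3},
]

density = cwm_density(0.5, 2, components)
memberships = posterior_probabilities(0.5, 2, components)
```

The mixing weights must be non-negative and sum to one for the combination to be convex. Covariates of mixed type would need an additional marginal term for the discrete covariates, which this sketch leaves out.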
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.