Mio, C., Gianini, G., Damiani, E. (2019). K-Means Clustering in Dual Space for Unsupervised Feature Partitioning in Multi-view Learning. In Proceedings - 14th International Conference on Signal Image Technology and Internet Based Systems, SITIS 2018 (pp.1-8). IEEE [10.1109/SITIS.2018.00012].

K-Means Clustering in Dual Space for Unsupervised Feature Partitioning in Multi-view Learning

Gianini, G;
2019

Abstract

In contrast to single-view learning, multi-view learning simultaneously trains distinct algorithms on disjoint subsets of features (the views) and jointly optimizes them so that they reach a consensus. Multi-view learning is typically used when the data are described by a large number of features, and it aims to exploit the different statistical properties of distinct views. When the features have no natural grouping, a preliminary task is multi-view generation (MVG), which consists in partitioning the feature set into subsets (views) characterized by some desired properties. Given a dataset in the form of a table with a large number of columns, the desired solution of the MVG problem is a partition of the columns that optimizes an objective function encoding typical requirements. If the class labels are available, one wants to minimize the inter-view redundancy in target prediction and maximize consistency. If the class labels are not available, one simply wants to minimize inter-view redundancy, i.e. the information each view has about the others. In this work, we approach the MVG problem in the latter, unsupervised, setting. Our approach is based on the transposition of the data table: the original instance rows are mapped into columns (the 'pseudo-features'), while the original feature columns become rows (the 'pseudo-instances'). The latter can then be partitioned by any suitable standard instance-partitioning algorithm, and the resulting groups can be regarded as groups of the original features, i.e. views, which constitute a solution of the MVG problem. We demonstrate the approach using k-means and the standard benchmark MNIST dataset of handwritten digits.
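
The transposition idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmeans_labels` and `generate_views`, the farthest-first seeding, the Euclidean distance on raw columns, and the synthetic toy table are all assumptions made for illustration only. The sketch transposes the data table so that each original feature becomes a pseudo-instance, runs a plain Lloyd-style k-means on those rows, and reads the resulting clusters as views.

```python
import numpy as np


def kmeans_labels(points, k, n_iter=100):
    """Lloyd-style k-means with deterministic farthest-first seeding.

    Hypothetical helper for illustration; returns one cluster label per row.
    """
    points = np.asarray(points, dtype=float)
    # Farthest-first seeding: start from row 0, then repeatedly add the row
    # whose nearest already-chosen seed is farthest away.
    chosen = [0]
    for _ in range(1, k):
        d = ((points[:, None, :] - points[chosen][None, :, :]) ** 2).sum(axis=2)
        chosen.append(int(d.min(axis=1).argmax()))
    centroids = points[chosen].copy()

    labels = None
    for _ in range(n_iter):
        # Squared Euclidean distance of every row to every centroid.
        d = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels


def generate_views(X, n_views):
    """Partition the columns of X into views by clustering the transposed table."""
    # Transposition: original features become rows (pseudo-instances),
    # original instances become columns (pseudo-features).
    pseudo_instances = np.asarray(X, dtype=float).T
    labels = kmeans_labels(pseudo_instances, n_views)
    return [np.flatnonzero(labels == v).tolist() for v in range(n_views)]


# Synthetic toy table (an assumption, not data from the paper):
# columns 0-2 are noisy copies of one signal, columns 3-5 of another.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack(
    [a + 0.05 * rng.normal(size=200) for _ in range(3)]
    + [b + 0.05 * rng.normal(size=200) for _ in range(3)]
)

views = generate_views(X, n_views=2)
print(views)  # each inner list holds the column indices of one view
```

On this toy table the two groups of columns are well separated, so the clustering recovers them as two views. The dense distance computation is only suitable for small tables; for a dataset of the size of MNIST a library k-means implementation would be the practical choice.
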
Type: paper
Keywords: Consensus clustering; Dual space clustering; K-means; Multi-view learning
Language: English
Conference: 14th International Conference on Signal Image Technology and Internet Based Systems, SITIS 2018 - 26 November 2018 through 29 November 2018
Conference year: 2018
Editors: DiBaja, GS; Gallo, L; Yetongnon, K; Dipanda, A; CastrillonSantana, M; Chbeir, R
Published in: Proceedings - 14th International Conference on Signal Image Technology and Internet Based Systems, SITIS 2018
ISBN: 9781538693858
Publication year: 2019
First page: 1
Last page: 8
Article number: 8706154
Access: reserved
Files in this record:

Mio-2019-SITIS 2018-AAM.pdf
Access: archive managers only
Description: Conference contribution
Attachment type: Author’s Accepted Manuscript, AAM (Post-print)
License: All rights reserved
Size: 2.38 MB
Format: Adobe PDF

Mio-2019-SITIS 2018-VoR.pdf
Access: archive managers only
Description: Conference contribution
Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 578.13 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/454841