Roffo, G., Melzi, S. (2017). Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality. In New Frontiers in Mining Complex Patterns. NFMCP 2016 (pp.19-35). Springer [10.1007/978-3-319-61461-8_2].

Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

Melzi Simone
2017

Abstract

In an era where accumulating data is easy and storing it is inexpensive, feature selection plays a central role in reducing the high dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones within an arbitrary set of cues. The problem is mapped onto an affinity graph, where features are the nodes, and the solution is given by assessing the importance of nodes through indicators of centrality, in particular Eigenvector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking central nodes identifies candidate features, which turn out to be effective from a classification point of view, as shown by a thorough experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others) and compared against filter, embedded, and wrapper methods. The results are remarkable in terms of accuracy, stability, and low execution time.
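The idea described in the abstract (features as nodes of an affinity graph, ranked by eigenvector centrality) can be illustrated with a minimal sketch. The affinity used here, absolute Pearson correlation between feature columns, is an assumption chosen for brevity and is not taken from the paper; the function name `rank_features_by_centrality` and the synthetic data are likewise hypothetical.

```python
import numpy as np


def rank_features_by_centrality(X):
    """Rank the columns of X by eigenvector centrality on a feature graph.

    Hypothetical helper for illustration; the affinity below is an assumption.
    """
    # Assumed affinity: absolute Pearson correlation between feature columns.
    # (Constant columns would produce NaNs here and need handling first.)
    A = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(A, 0.0)  # drop self-loops

    # A is symmetric, so eigh applies; eigenvalues come back in ascending order.
    _, eigenvectors = np.linalg.eigh(A)
    # For a connected non-negative graph the leading eigenvector has entries
    # of one sign; abs() removes the arbitrary sign returned by the solver.
    centrality = np.abs(eigenvectors[:, -1])

    # Most central feature first.
    ranking = np.argsort(-centrality)
    return ranking, centrality


# Synthetic demo: three features share a latent signal, two are pure noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
X_demo = np.column_stack(
    [latent + 0.1 * rng.normal(size=200) for _ in range(3)]
    + [rng.normal(size=200) for _ in range(2)]
)
ranking, centrality = rank_features_by_centrality(X_demo)
print(ranking, np.round(centrality, 3))
```

With this affinity, a feature scores highly when it is strongly tied to other high-scoring features, which is the recursive notion of importance that EC captures. A different affinity (for example, one that also encodes class relevance) would change which features end up central.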
paper
Data mining; Feature selection; High dimensionality; Ranking;
English
5th International Workshop on New Frontiers in Mining Complex Patterns, NFMCP 2016, held in conjunction with the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2016, 19 September 2016
2016
Appice, A; Ceci, M; Loglisci, C; Masciari, E; Raś, Z
New Frontiers in Mining Complex Patterns. NFMCP 2016
978-3-319-61460-1
2 July 2017
2017
10312
19
35
https://link.springer.com/book/10.1007/978-3-319-61461-8?page=1#toc
reserved
Files in this item:
File: NFmcp2016_paper_13.pdf
Access: archive managers only
Size: 630.25 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/350422
Citations
  • Scopus 79