Reshaped Sequential Replacement algorithm: an efficient approach to variable selection

CASSOTTI, MATTEO;GRISONI, FRANCESCA;TODESCHINI, ROBERTO
2014

Abstract

A modified version of the Sequential Replacement (SR) algorithm for variable selection is proposed, featuring modern functionalities aimed at: 1) reducing the computational time; 2) estimating the real predictivity of the model; 3) identifying models suffering from pathologies. This redesigned version was called the Reshaped Sequential Replacement (RSR) algorithm. The RSR algorithm was applied to several regression and classification datasets and was compared with the original SR method by means of a Design of Experiments (DoE). The DoE took into account the functions that affect the outcome of the search in terms of generated combinations of variables and time required for computation. The results were also compared with published models on the same datasets, taken as reference and obtained by different variable selection methods. This latter comparison showed that the RSR algorithm found good subsets of variables on all datasets, even though the reference models were not always found. When the reference model was not found, the RSR algorithm returned comparable or better subsets of variables, as evaluated in cross-validation. The DoE showed that including the additional functions made it possible to obtain models with equivalent or better performance in less computational time than the original SR method.
Type: Journal article - Scientific article
Keywords: Multivariate analysis; QUIK rule; Roulette wheel; Sequential replacement; Tabu list; Variable selection
Language: English
Year: 2014
Volume: 133
Pages: 136-148
Cassotti, M., Grisoni, F., Todeschini, R. (2014). Reshaped Sequential Replacement algorithm: an efficient approach to variable selection. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 133, 136-148 [10.1016/j.chemolab.2014.01.011].

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/51065