Different classification methods (Partial Least Squares Discriminant Analysis, Extended Canonical Variates Analysis and Linear Discriminant Analysis), in combination with variable selection approaches (Forward Selection and Genetic Algorithms), were compared, evaluating their capabilities in the geographical discrimination of wine samples. Sixty-two samples were analysed by means of dynamic headspace gas chromatography mass spectrometry (HS-GC-MS) and the entire chromatographic profile was considered to build the dataset. Since variable selection techniques pose a risk of overfitting when a large number of variables is used, a method for coupling data dimension reduction and variable selection was proposed. This approach compresses windows of the original data by retaining only significant components of local Principal Component Analysis models. The subsequent variable selection is then performed on these locally derived score variables. The results confirmed that the classification models achieved on the reduced data were better than those obtained on the entire chromatographic profile, with the exception of Extended Canonical Variates Analysis, which gave acceptable models in both cases. Copyright © 2008 John Wiley & Sons, Ltd.
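The window-wise compression described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say how windows are defined or how "significant" components are chosen, so the fixed window width, the cumulative explained-variance threshold, and the names local_pca_scores, window_width and var_threshold are all assumptions made for this example.

```python
import numpy as np


def local_pca_scores(X, window_width=50, var_threshold=0.95):
    """Compress each block of adjacent columns of X with its own PCA model.

    Assumptions (not taken from the paper): windows are contiguous and of
    fixed width, and a component counts as "significant" while it is needed
    to reach `var_threshold` cumulative explained variance in its window.

    Returns (scores, window_index): the concatenated score variables and,
    for every score column, the index of the window it was derived from.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    score_blocks, window_index = [], []

    for w, start in enumerate(range(0, n_vars, window_width)):
        block = X[:, start:start + window_width]
        block = block - block.mean(axis=0)           # column-centre the window
        U, s, _ = np.linalg.svd(block, full_matrices=False)
        total = np.sum(s ** 2)
        if total == 0:                               # constant window, skip it
            continue
        explained = np.cumsum(s ** 2) / total
        # smallest number of components whose cumulative variance
        # reaches the threshold
        k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
        score_blocks.append(U[:, :k] * s[:k])        # PCA scores T = U * S
        window_index.extend([w] * k)

    if not score_blocks:
        return np.empty((n_samples, 0)), np.array([], dtype=int)
    return np.hstack(score_blocks), np.array(window_index)


if __name__ == "__main__":
    # Synthetic stand-in for a 62-sample chromatographic data matrix.
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(62, 1000))
    T, idx = local_pca_scores(X_demo, window_width=100, var_threshold=0.9)
    print(T.shape, idx.shape)
```

The returned score matrix would then be the input to a variable selection step (for example Forward Selection or a Genetic Algorithm), and window_index lets a selected score be traced back to its region of the chromatogram.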

Ballabio, D., Skov, T., Leardi, R., Bro, R. (2008). Classification of GC-MS measurements of wines by combining data dimension reduction and variable selection techniques. JOURNAL OF CHEMOMETRICS, 22(8), 457-463 [10.1002/cem.1173].

Classification of GC-MS measurements of wines by combining data dimension reduction and variable selection techniques

Ballabio, Davide
2008

Journal article - Scientific article
Keywords: classification; variable selection; data reduction; wine; GC-MS; chemometrics
Language: English
Year: 2008
Volume: 22
Issue: 8
First page: 457
Last page: 463

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/20559