Saibene, A., Gasparini, F. (2023). Genetic algorithm for feature selection of EEG heterogeneous data. EXPERT SYSTEMS WITH APPLICATIONS, 217(1 May 2023) [10.1016/j.eswa.2022.119488].

Genetic algorithm for feature selection of EEG heterogeneous data

Saibene, Aurora (first author); Gasparini, Francesca (last author)
2023

Abstract

Overview: Electroencephalographic (EEG) signals provide highly informative data on brain activity and function, and a great variety of features can therefore be extracted from them. Problem: The heterogeneity and high dimensionality of EEG signals may hinder data interpretation. A priori knowledge has been widely used to mitigate high-dimensionality problems, although it may discard some of the information and patterns present in the data. Moreover, data heterogeneity remains an open issue that often makes generalization difficult. Methods: In this study, we propose a Genetic Algorithm (GA) for feature selection in which only the stopping criteria and fitness functions are modified, and which can be used with either a supervised or an unsupervised approach. Our proposal considers three different fitness functions and does not rely on expert knowledge. Starting from two publicly available datasets on cognitive workload and motor movement/imagery, the EEG signals are processed and normalized, and their features are computed in the time, frequency and time–frequency domains. Feature selection is performed with our GA proposal and compared with two benchmarking techniques, i.e., using the entire feature set and reducing it through principal component analysis. Results & Conclusions: Experiments with our proposal achieve better results than the benchmark in terms of overall performance and feature reduction. Moreover, our novel fitness function outperforms the benchmark when the two datasets are merged, showing the effectiveness of our proposal on heterogeneous data. The selected features are consistent with the neuroscientific literature on the considered experimental conditions.
Future work will focus on providing a better scoring for the unsupervised technique, on the hybrid use of the two approaches, and on the optimization of the GA parameters.
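To make the idea of GA-based feature selection more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the nearest-centroid fitness, the size penalty, the "no improvement for `patience` generations" stopping rule, the toy data, and all function and parameter names are assumptions chosen for illustration. The abstract does not specify the paper's three fitness functions or its modified stopping criteria, so none of them is reproduced here.

```python
import random

def nearest_centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on the selected columns."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += int(pred == t)
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    # Assumed supervised fitness: accuracy minus a small cost per selected feature.
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, max_gen=50, patience=10, p_mut=0.1, seed=0):
    """Evolve binary feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best, best_fit, stale = None, float("-inf"), 0
    for _ in range(max_gen):
        scored = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        top_fit = fitness(X, y, scored[0])
        if top_fit > best_fit:
            best, best_fit, stale = scored[0][:], top_fit, 0
        else:
            stale += 1
            if stale >= patience:  # assumed stopping rule: no recent improvement
                break
        parents = scored[: pop_size // 2]   # truncation selection
        children = [scored[0][:]]           # elitism: keep the current best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = children
    return best, best_fit

def make_toy_data(n_samples=60, n_features=8, seed=1):
    """Synthetic data: only features 0 and 1 carry class information."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        row[1] -= 4 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
mask, score = ga_select(X, y)
print(mask, round(score, 3))
```

Each individual is a binary mask over the feature vector, so the per-feature penalty directly trades classification accuracy against feature reduction. An unsupervised variant could, under the same assumptions, swap `fitness` for a clustering quality score computed on the selected columns.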
Journal article - Scientific article
Electroencephalography; Evolutionary feature selection; Genetic algorithm; K-means clustering; Support vector machine
English
10.48550/arXiv.2103.07117
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/400876