Scapin, D., Cisotto, G., Gindullina, E., Badia, L. (2022). Shapley Value as an Aid to Biomedical Machine Learning: a Heart Disease Dataset Analysis. In Proceedings - 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2022 (pp.933-939). Institute of Electrical and Electronics Engineers Inc. [10.1109/CCGrid54584.2022.00113].

Shapley Value as an Aid to Biomedical Machine Learning: a Heart Disease Dataset Analysis

Cisotto, G;
2022

Abstract

This paper investigates machine-learning-aided decision making for biomedical problems and how to improve it through meta-assessment of the most relevant features. Classification algorithms are usually trained and used on high-dimensional datasets (i.e., with a very large number of features), which is inefficient and costly. It would be beneficial to identify the features that contribute most to assigning a category to a subject and, in particular, to diagnosing a pathological condition. Cooperative game theory can help through the Shapley value, an indicator with desirable properties according to which the players, in our case the input features, can be ranked. We apply this framework to a supervised machine learning scenario in which a random forest classifier is applied to heart disease detection. Using a publicly available dataset, we identify the most relevant features affecting the decision, thus obtaining practical guidelines for a compact yet efficient description based on an analytical rationale.
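As a rough illustration of the ranking idea described in the abstract, the sketch below computes exact Shapley values for a tiny cooperative game and sorts the players by them. It is not the authors' pipeline: the feature names (`feature_a`, `feature_b`, `feature_c`) and the characteristic function `toy_value` are hypothetical and contain no data from the paper. In a real setting, `toy_value` would presumably be replaced by some measure of classifier performance on a feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player for the characteristic function `value`.

    `value` maps a frozenset of players to a number. The cost grows
    exponentially with the number of players, so this only suits small games.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # size of the coalition that p joins: 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical characteristic function (illustrative numbers only)."""
    base = {"feature_a": 1.0, "feature_b": 2.0, "feature_c": 3.0}
    v = sum(base[f] for f in coalition)
    # Assumed synergy: feature_a and feature_b are worth more together
    if {"feature_a", "feature_b"} <= coalition:
        v += 1.0
    return v


features = ["feature_a", "feature_b", "feature_c"]
phi = shapley_values(features, toy_value)

# Rank players (features) from most to least relevant
ranking = sorted(phi, key=phi.get, reverse=True)
print(ranking)
```

The values sum to the worth of the full coalition (the efficiency property), which is one of the desirable properties that make the Shapley value usable as a ranking criterion.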
Type: slide + paper
Keywords: Artificial intelligence; Game theory; Heart disease; Machine learning; Parametric analysis; Random forest classifier; Shapley value
Language: English
Conference: 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2022, 16 May 2022 through 19 May 2022
Year: 2022
Proceedings: Proceedings - 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2022
ISBN: 9781665499569
Pages: 933-939
Access rights: reserved
Files in this item:
File: Scapin-2022-CCGrid-VoR.pdf
Access: Archive managers only
Description: Proceedings - IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID)
Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 338.55 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/391755
Citations
  • Scopus 12
  • ISI (Web of Science) 5