Hepatitis C is an infectious disease that affects more than 70 million people worldwide and kills 400 thousand of them annually. To better understand this disease and its prognosis, medical doctors can take advantage of the electronic health records (EHRs) of patients, which contain data that computer-based approaches built on statistics and computational intelligence can process to unveil new discoveries and trends otherwise unnoticeable by physicians. In this study, we analyze EHRs of 540 healthy controls and 75 patients diagnosed with hepatitis C, and use machine learning classifiers to predict their diagnosis. We employ the top classifier (Random Forests) to detect the most diagnostic variables for hepatitis C, which turn out to be aspartate aminotransferase (AST) and alanine aminotransferase (ALT). These two enzyme levels are also used by physicians in the AST/ALT ratio, a traditional measure commonly employed in gastroenterology and hepatology. We apply the same approach to a validation dataset of 123 patients with hepatitis C and cirrhosis, and the same two variables arise as the most relevant. We therefore compare our approach with the AST/ALT ratio, and find that our two-feature ensemble learning model outperforms the traditional AST/ALT ratio on both datasets. Our results confirm the usefulness of ensemble machine learning for hepatitis C and cirrhosis diagnosis prediction. Moreover, our discoveries can have an impact on clinical practice, helping physicians predict diagnoses of patients at risk of hepatitis C and cirrhosis more precisely.
Chicco, D., Jurman, G. (2021). An ensemble learning approach for enhanced classification of patients with hepatitis and cirrhosis. IEEE Access, 9, 24485-24498 [10.1109/ACCESS.2021.3057196].
File | Size | Format
---|---|---
Chicco-2021-IEEE Access-VoR.pdf (open access). Description: Article. Attachment type: Publisher's Version (Version of Record, VoR). License: Creative Commons | 1.48 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.