Sepsis remains a leading cause of mortality worldwide, with one in five deaths attributed to its rapid and heterogeneous progression. Early diagnosis is hindered by the absence of reliable molecular biomarkers. To address this, we developed an integrative computational framework that combines supervised machine learning, statistical testing, and cross-dataset reproducibility analysis to identify robust transcriptomic signatures of sepsis. From 55 experimentally validated sepsis-related genes curated from the literature, our pipeline selected a 15-gene diagnostic panel achieving consistently high performance across 11 independent Gene Expression Omnibus (GEO) datasets (Matthews correlation coefficient (MCC) = 0.94; AUROC = 0.99). Feature ranking revealed four key predictors (CD177, S100A12, S100A8, and GATA3) that were repeatedly prioritized across independent datasets. Functional enrichment and network analysis further organized these genes into three coherent biological modules associated with neutrophil activation, immune suppression, and calgranulin-mediated inflammatory signaling. Benchmarking against established clinical panels confirmed superior accuracy, reproducibility, and interpretability. This study provides a reproducible, data-driven framework for biomarker discovery and a biologically grounded 15-gene panel for precision diagnostics in sepsis.
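
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how the Matthews correlation coefficient and the AUROC can be computed for a binary classifier. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions and are not taken from the study; the abstract does not state which implementation the authors used.

```python
from math import sqrt


def matthews_corrcoef(y_true, y_pred):
    """MCC from binary labels (1 = sepsis, 0 = control); hypothetical helper."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not results from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(matthews_corrcoef(y_true, y_pred))  # 0.5
print(auroc(y_true, scores))              # 0.9375
```

MCC uses all four cells of the confusion matrix, so it stays informative when classes are imbalanced, while AUROC summarizes ranking quality without fixing a threshold.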

Johari, T., Chicco, D. (2026). A modular transcriptomic signature paired with machine learning reveals core immune pathways in sepsis diagnosis. DISCOVER COMPUTING, 29(1) [10.1007/s10791-026-09974-2].

A modular transcriptomic signature paired with machine learning reveals core immune pathways in sepsis diagnosis

Chicco D. (last author)
2026

Journal article - Scientific article
Biomarker discovery; Feature importance; Gene expression; Machine learning; Pathway enrichment; Random forest; Sepsis; Transcriptomics
English
19 Feb 2026
2026
29
1
107
open
Files in this item:
File: Johari-Chicco-2026-Discover Computing-VoR.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 2.91 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/604924