Johari, T., Chicco, D. (2026). A modular transcriptomic signature paired with machine learning reveals core immune pathways in sepsis diagnosis. Discover Computing, 29(1) [10.1007/s10791-026-09974-2].
A modular transcriptomic signature paired with machine learning reveals core immune pathways in sepsis diagnosis
Chicco D.
Last author
2026
Abstract
Sepsis remains a leading cause of mortality worldwide, with one in five deaths attributed to its rapid and heterogeneous progression. Early diagnosis is hindered by the absence of reliable molecular biomarkers. To address this, we developed an integrative computational framework that combines supervised machine learning, statistical testing, and cross-dataset reproducibility analysis to identify robust transcriptomic signatures of sepsis. From 55 experimentally validated sepsis-related genes curated from the literature, our pipeline selected a 15-gene diagnostic panel that achieved consistently high performance across 11 independent Gene Expression Omnibus (GEO) datasets (Matthews correlation coefficient (MCC) = 0.94; AUROC = 0.99). Feature ranking revealed four key predictors (CD177, S100A12, S100A8, and GATA3) that were repeatedly prioritized across independent datasets. Functional enrichment and network analysis further organized these genes into three coherent biological modules associated with neutrophil activation, immune suppression, and calgranulin-mediated inflammatory signaling. Benchmarking against established clinical panels confirmed superior accuracy, reproducibility, and interpretability. This study provides a reproducible, data-driven framework for biomarker discovery and a biologically grounded 15-gene panel for precision diagnostics in sepsis.

| File | Size | Format |
|---|---|---|
| Johari-Chicco-2026-Discover Computing-VoR.pdf (open access; attachment type: Publisher’s Version (Version of Record, VoR); license: Creative Commons) | 2.91 MB | Adobe PDF |
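
The abstract reports performance as a Matthews correlation coefficient (MCC) and an AUROC. For readers unfamiliar with these two metrics, the following is a minimal sketch of how each can be computed for a binary classifier. It is not the authors' pipeline: the function names (`confusion_counts`, `mcc`, `auroc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are unrelated to the study's data.

```python
import math

def confusion_counts(labels, scores, threshold=0.5):
    """Count TP, TN, FP, FN after thresholding scores (threshold is an assumed value)."""
    tp = tn = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        elif predicted == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # A common convention: return 0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data, invented for illustration only (1 = sepsis, 0 = control).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

tp, tn, fp, fn = confusion_counts(labels, scores)
print("MCC:", mcc(tp, tn, fp, fn))
print("AUROC:", auroc(labels, scores))
```

MCC uses all four cells of the confusion matrix and ranges from -1 to +1, so it stays informative when classes are imbalanced. AUROC is threshold-free and summarizes how well the scores rank positives above negatives.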


