Introduction

Matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI-MSI) enables interrogation of the molecular composition of tissue [1]. Its non-destructive nature preserves tissue morphology, allowing the subsequent acquisition of hematoxylin and eosin (H&E) stained whole slide images (WSIs) [2]. In this retrospective study of 184 thyroid cancer patients across 7 tissue microarrays, we integrate MALDI-MSI proteomics with H&E-stained WSIs and propose a vision transformer-based deep learning model that learns a joint morpho-molecular representation of histology at matched coordinates.

Methods

MALDI-MSI was used to map the spatial distribution and relative abundance of tryptic peptides (released after enzymatic digestion with trypsin) in thyroid formalin-fixed paraffin-embedded (FFPE) tissue sections, using a timsTOF fleX MALDI mass spectrometer at a spatial resolution of 20 µm/px. H&E-stained WSIs were acquired using a MIDI II digital scanner with a 40x objective, resulting in a resolution of 0.25 µm/px. We co-registered the two imaging modalities to match grids of 6x6 MALDI-MSI spots with the corresponding H&E image patches. We developed a vision transformer-based deep learning model, inspired by Shaban et al. [3], that takes the joint morpho-molecular signal as input and learns a representation through self-attention (see Figure 1) to classify each paired image patch.

Results/Discussion

After preprocessing the MALDI-MSI data (including root mean square normalization and peak picking), we retained 70 m/z features. We evaluated model performance on a held-out test set of 36 patients, after 10-fold cross-validation on the remaining 148 patients for training and validation, ensuring that samples from the same patient were never split across dataset partitions.
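The patient-level partitioning described above can be illustrated with a minimal sketch. It is not the authors' procedure: the patient identifiers, the round-robin fold assignment, the random seed, and the function names `split_patients` and `partition_patches` are all assumptions made for illustration. Only the counts (184 patients, 36 held out, 10 folds over the remaining 148) come from the abstract.

```python
import random
from collections import defaultdict


def split_patients(patient_ids, n_test=36, n_folds=10, seed=0):
    """Hold out n_test patients, then assign the rest to n_folds folds.

    The split is made on patient IDs (not on patches), so every patch
    of a given patient ends up in exactly one partition.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility
    rng.shuffle(unique_ids)
    test_ids = set(unique_ids[:n_test])
    remaining = unique_ids[n_test:]
    # Round-robin assignment keeps fold sizes within one patient of each other.
    fold_of = {pid: i % n_folds for i, pid in enumerate(remaining)}
    return test_ids, fold_of


def partition_patches(patches, test_ids, fold_of):
    """Group (patient_id, patch) pairs under 'test' or a fold index."""
    parts = defaultdict(list)
    for pid, patch in patches:
        key = "test" if pid in test_ids else fold_of[pid]
        parts[key].append((pid, patch))
    return parts


# Toy data: 184 hypothetical patients with 3 placeholder patches each.
patches = [(f"P{i:03d}", f"patch_{i}_{j}") for i in range(184) for j in range(3)]
test_ids, fold_of = split_patients([pid for pid, _ in patches])
parts = partition_patches(patches, test_ids, fold_of)
```

In each cross-validation round, one fold index would serve as the validation set and the other nine as the training set, while the `"test"` partition stays untouched until the final evaluation. The sketch does not stratify by diagnosis, since the abstract does not say whether the folds were stratified.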
The proposed architecture achieved an average balanced classification accuracy of 0.84 ± 0.01 when combining both imaging modalities, outperforming models trained on molecular data alone (0.75 ± 0.03). The number of embedding dimensions was positively correlated with classification performance. In the molecular-only setting, performance initially declines as the molecular channels are reduced (100%, 75%, 50%) but unexpectedly improves when only 25% are retained, suggesting the presence of noisy or redundant molecular features whose removal leads to a more discriminative representation.

Conclusion

We curated a diverse multimodal dataset of MALDI-MSI proteomics data and H&E-stained WSIs from thyroid FFPE tissue sections. The results support the added value of combining morphological features with molecular ones: the proposed neural network effectively learns a joint morpho-molecular representation while maintaining data efficiency. We plan to aggregate the patch-level representations into patient-level ones through multiple instance learning.

Novelty

We improved downstream performance by fusing the morpho-molecular signal through deep learning, opening a new multimodal image-based research line.

Impact

The multimodal image-based approach enabled accurate distinction of indeterminate thyroid lesions, potentially reducing patient overtreatment [4].

References

[1] Korber, A., Anthony, I.G., Heeren, R.M., ‘Mass Spectrometry Imaging’, Analytical Chemistry, 97 (2025), pp. 15517-15549.
[2] Coelho, V. et al., ‘Improving the Annotation for Spatial Proteomics: A Computational Approach to Enhance Molecular Characterization of Thyroid Nodules’, Journal of Proteome Research, (2026).
[3] Shaban, M. et al., ‘A Foundation Model for Spatial Proteomics’, arXiv preprint, arXiv:2506.03373 (2025).
[4] Nikiforov, Y. E. et al., ‘Nomenclature revision for encapsulated follicular variant of papillary thyroid carcinoma: a paradigm shift to reduce overtreatment of indolent tumors’, JAMA Oncology, 2.8 (2016), pp. 1023-1029.

Acknowledgment

We thank the team from the proteomics and metabolomics unit at UniMiB, led by Prof. Fulvio Magni, and the team from the cancer molecular pathology unit at Fondazione IRCCS San Gerardo dei Tintori, led by Prof. Fabio Pagni.

Funding

Fondazione Cariplo (2023-1804).

Coelho, V., L'Imperio, V., Smith, A., Denti, V. (2026). Towards morpho-molecular learning in computational pathology. Paper presented at: 21st European Molecular Imaging Meeting - March 24-27 2026, Ljubljana, Slovenia.

Towards morpho-molecular learning in computational pathology

Coelho, V; L'Imperio, V; Smith, A; Denti, V
2026

Oral presentation
MALDI-MSI, WSI, pathology, multimodal artificial intelligence, deep learning
English
21st European Molecular Imaging Meeting - March 24-27 2026
https://e-smi.eu/meetings/emim/2026_ljubljana/

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/599621