Introduction/Rationale: MALDI-MSI is a powerful technology that enables the molecular composition of a specimen to be evaluated directly in situ. The acquired data have been found to be particularly suitable for clinical purposes, and it is hoped that MALDI-MSI can eventually provide diagnostic assistance in particularly difficult cases. The aim of this work is to provide the user with a complete MALDI-MSI data analysis workflow through an easy-to-use software interface. The end-user should be able to import and preprocess spectra, extract and select features, train and tune models and, finally, classify MS images pixel by pixel, with the final output being the molecular classification of bioptic specimens.

Methods: The software uses an ensemble of classifiers, rather than a single classification algorithm, to classify patients. Each classifier independently selects the most discriminatory features by discarding invariant and redundant features through the Recursive Feature Elimination (RFE) algorithm. The classifiers then undergo cross-validation and tuning over a set of parameters to assess and maximise their performance. Finally, classification is performed on all the spectra collected from the analysis of the entire biopsy, with each algorithm contributing a vote to the final report (an unweighted majority vote, or a vote weighted within a Bayesian framework). Pixels corresponding to spectra are coloured according to the predicted class, generating a red/green MS segmentation image that reflects the classification.

Results: The software has been tested using data acquired from the MALDI-MSI analysis of formalin-fixed paraffin-embedded (FFPE) bioptic specimens. The molecular diagnosis was then correlated with, and confirmed by, histological evaluation performed by a pathologist. The software successfully detected benign and malignant tissue cores within a thyroid Tissue MicroArray (TMA). To extract the molecular signatures of benignity and malignancy used for classification, the training set was composed of histologically homogeneous tissue cores. Applying this classification highlighted the presence of a heterogeneous core (affected by Papillary Thyroid Carcinoma, PTC). From a molecular standpoint the core was determined to contain both benign and malignant regions, whilst the initial histological evaluation had indicated only benign cells. This tissue heterogeneity was then confirmed by further histological evaluation. The software has also been applied to the molecular detection of Medullary Thyroid Carcinoma (MTC). The training set was composed of benign and malignant tissue sub-areas that the pathologist had marked as homogeneous in terms of cell composition and type. When tested on clinical MTC tissue specimens, the software successfully highlighted the area corresponding to the tumour nodule.

Conclusions/Novelty: MALDI-MSI, coupled with state-of-the-art machine learning algorithms, can potentially assist the diagnostic process by evaluating molecular alterations in tissue. Consequently, tumour areas without evident morphological changes can be detected, suggesting that a diagnosis could possibly be obtained at earlier stages.
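
The voting step described in the Methods can be illustrated with a minimal sketch. It is not the thesis software: the function name `ensemble_vote` and the `weights` argument are hypothetical, and the weights stand in for whatever reliability scores the Bayesian framework produces, which the abstract does not specify. With no weights the function performs an unweighted majority vote over the labels that the individual classifiers assign to one spectrum.

```python
from collections import defaultdict


def ensemble_vote(predictions, weights=None):
    """Combine one label per classifier into a single label for a spectrum.

    predictions: list of class labels, one per classifier.
    weights: optional list of non-negative floats, one per classifier
             (hypothetical reliability scores); None gives the
             unweighted majority vote.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("one weight per classifier is required")

    # Accumulate the total (weighted) support for each class label.
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight

    # Ties are broken by sorted label order so the result is deterministic.
    return max(sorted(totals), key=lambda label: totals[label])
```

For example, `ensemble_vote(["malignant", "benign", "benign"])` follows the two-to-one majority, whereas passing `weights=[0.9, 0.3, 0.3]` lets the single, more heavily weighted classifier prevail. The tie-breaking rule is an arbitrary choice made for this sketch.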

Introduction/Rationale: The powerful MALDI imaging technology makes it possible to determine the molecular composition of a sample directly in situ. The literature has shown that data acquired with this technology can be used successfully for clinical purposes, in particular to provide molecular assistance in cases where cyto-morphological diagnosis proves especially difficult. The aim of the thesis work was to develop software that, through its intuitive graphical interface, could assist the user throughout the entire MALDI imaging data analysis process. The software supports the import and preprocessing of mass spectra, the extraction and selection of significant variables, the construction of classification models and, finally, the classification of molecular images at the single-pixel level: the final output is a segmented image that defines the patient's classification at the level of tissue areas.

Methods: The software uses several classification models simultaneously: each classifier independently selects the most discriminating signals by eliminating redundant or invariant variables through the Recursive Feature Elimination (RFE) algorithm. Each classifier is evaluated for its classification ability by cross-validation and is tuned to further maximise its classification power. Finally, classification is carried out on every single mass spectrum of the imaging dataset obtained from the analysis of an entire biopsy. Each algorithm contributes to the final classification report with a vote, which can be weighted according to the reliability of the algorithm itself, computed according to the Bayesian framework. The pixels of the segmented image are coloured green or red according to the classification of the corresponding mass spectra.

Results: The software was applied to MALDI imaging data from the analysis of biopsies of paraffin-embedded (FFPE) tissue. The molecular diagnosis was then correlated and confirmed by overlay with tissue sections stained according to standard histological practice and evaluated by the pathologist. The software was able to highlight the presence of benign and malignant tissue within a core of a thyroid Tissue MicroArray (TMA). To extract the molecular signature of benign and malignant tissue to be applied in classification, the training set was composed only of cores that showed tissue homogeneity at the histological level. Tissue classification revealed the presence of a heterogeneous core (affected by Papillary Thyroid Carcinoma, PTC). From a histological point of view, the core had initially been reported as composed of benign cells only. The tissue heterogeneity was then confirmed by a more in-depth histological analysis. The software was also applied to the molecular diagnosis of Medullary Thyroid Carcinoma (MTC). The training set was composed of benign and malignant tissue regions, indicated by the pathologist as homogeneous in terms of cell composition and type. The software was able to highlight the area corresponding to the nodule in the MTC tissue sections.

Conclusions/Novelty: MALDI imaging technology, combined with currently available machine learning algorithms, can potentially assist the diagnostic process by bringing out molecular alterations in tissue. Consequently, the technology can highlight tumour areas that are not associated with appreciable morphological changes, suggesting a possible role in diagnosis at early stages of disease.
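
The final colouring step, in which each pixel takes the colour of its spectrum's predicted class, might look like the sketch below. It rests on assumptions not stated in the abstract: predictions are given as a mapping from hypothetical `(x, y)` pixel coordinates to class labels, red is taken to mean malignant and green benign, and pixels without a spectrum are left black. The function name `segmentation_image` is illustrative only.

```python
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BACKGROUND = (0, 0, 0)


def segmentation_image(pixel_labels, width, height, malignant_label="malignant"):
    """Build a row-major RGB grid from {(x, y): label} predictions.

    Assumption for this sketch: the malignant class is drawn in red,
    every other predicted class in green, and unmeasured pixels stay black.
    """
    image = [[BACKGROUND for _ in range(width)] for _ in range(height)]
    for (x, y), label in pixel_labels.items():
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"pixel {(x, y)} lies outside the image")
        image[y][x] = RED if label == malignant_label else GREEN
    return image
```

The returned nested list is indexed as `image[y][x]`, so it can be handed to any imaging library that accepts rows of RGB tuples.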

Galli, M. (2018). AN EASY-TO-USE SOFTWARE PROGRAM FOR THE ENSEMBLE PIXEL-BY-PIXEL CLASSIFICATION OF MALDI-MSI DATASETS. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2018).

AN EASY-TO-USE SOFTWARE PROGRAM FOR THE ENSEMBLE PIXEL-BY-PIXEL CLASSIFICATION OF MALDI-MSI DATASETS

GALLI, MANUEL
2018

MAGNI, FULVIO
ZOPPIS, ITALO FRANCESCO
Mass spectrometry; Machine learning; Classification
BIO/10 - BIOCHEMISTRY
English
20-mar-2018
MEDICINA TRASLAZIONALE E MOLECOLARE - DIMET - 76R
30
2016/2017
open
Files in this item:
phd_unimib_717890.pdf (open access)
Description: doctoral thesis
Attachment type: Doctoral thesis
Size: 13.66 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/199037