Bicocca Open Archive

A Model of Selective Advantage for the Efficient Inference of Cancer Clonal Evolution

RAMAZZOTTI, DANIELE

2016

Abstract

Recently, there has been a resurgence of interest in rigorous and scalable algorithms for the efficient inference of cancer progression from genomic patient data. The motivations are manifold: (i) rapidly growing NGS and single-cell data from cancer patients, (ii) a long-felt need for novel Data Science and Machine Learning algorithms well suited to inferring models of cancer progression, and (iii) a desire to understand the temporal and heterogeneous structure of tumors so as to tame their natural progression through the most efficacious therapeutic interventions. This thesis presents a multi-disciplinary effort to algorithmically and efficiently model tumor progression as the successive accumulation of genetic alterations, each giving rise to a population that manifests a novel cancer phenotype. The framework presented in this work, along with the efficient algorithms derived from it, represents a novel and versatile approach for inferring cancer progression, whose accuracy and convergence rates surpass those of existing techniques. The approach draws on insights from, and contributes to, several fields, including machine learning algorithms, the theory of causality, and cancer biology. Furthermore, an optimal, versatile and modular pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes is proposed. The pipeline combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. Finally, the results are rigorously validated using synthetic data created with realistic generative models, and empirically interpreted in the context of real cancer datasets; in the latter case, biologically significant conclusions revealed by the reconstructed progressions are also highlighted. Specifically, the pipeline's ability to reproduce much of the current knowledge on colorectal cancer progression, as well as to suggest novel, experimentally verifiable hypotheses, is demonstrated. Lastly, it is shown that the proposed framework can be applied, mutatis mutandis, to reconstructing the evolutionary history of cancer clones in single patients, as illustrated by an example with multiple-biopsy data from clear cell renal carcinomas.

Supervisor affiliated with Bicocca: STELLA, FABIO ANTONIO
Keywords: Bioinformatics; Algorithms; Cancer evolution; Causality; Graphical models; Data analysis; Data mining; Machine learning
Thesis disciplinary sector: INF/01 - INFORMATICA (Computer Science)
Content language: English
Defense date: 22 Feb 2016
Doctoral program: INFORMATICA - 22R
Doctoral cycle: 28
Academic year of degree award: 2014/2015
Full text: open
Citation: (2016). A Model of Selective Advantage for the Efficient Inference of Cancer Clonal Evolution. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2016).
Appears in categories: 07 - Bicocca doctoral theses, post 2009

Files in this item:

File	Size	Format
phd_unimib_725339.pdf (open access; description: Doctoral thesis; attachment type: Doctoral thesis)	37.6 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/100453
