Recently, there has been a resurgence of interest in rigorous and scalable algorithms for the efficient inference of cancer progression from genomic patient data. The motivations are manifold: (i) rapidly growing NGS and single-cell data from cancer patients, (ii) a long-felt need for novel data science and machine learning algorithms well suited to inferring models of cancer progression, and (iii) a desire to understand the temporal and heterogeneous structure of tumors so as to tame their natural progression through the most efficacious therapeutic interventions. This thesis presents a multidisciplinary effort to algorithmically and efficiently model tumor progression as the successive accumulation of genetic alterations, each resulting in a population that manifests a novel cancer phenotype. The framework presented in this work, along with the efficient algorithms derived from it, represents a novel and versatile approach for inferring cancer progression, whose accuracy and convergence rates surpass those of other existing techniques. The approach draws its power from insights in, and contributes to, several fields, including algorithms in machine learning, the theory of causality, and cancer biology. Furthermore, an optimal, versatile and modular pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes is proposed. The pipeline combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. Finally, the results are rigorously validated using synthetic data created with realistic generative models, and empirically interpreted in the context of real cancer datasets; in the latter case, biologically significant conclusions revealed by the reconstructed progressions are also highlighted.
Specifically, the pipeline's ability to reproduce much of the current knowledge on colorectal cancer progression, as well as to suggest novel experimentally verifiable hypotheses, is demonstrated. Lastly, it is shown that the proposed framework can be applied, mutatis mutandis, to reconstructing the evolutionary history of cancer clones in single patients, as illustrated by an example with multiple-biopsy data from clear cell renal carcinomas.
Ramazzotti, D. (2016). A Model of Selective Advantage for the Efficient Inference of Cancer Clonal Evolution. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2016).
A Model of Selective Advantage for the Efficient Inference of Cancer Clonal Evolution
RAMAZZOTTI, DANIELE
2016
File | Description | Attachment type | Size | Format |
---|---|---|---|---|
phd_unimib_725339.pdf (open access) | Doctoral thesis | Doctoral thesis | 37.6 MB | Adobe PDF |