In clinical studies it is often of interest to compare the survival experience of patients in two or more treatment groups. In some situations the categorization is not fixed at baseline but changes during follow-up: patients, for example, start on an initial treatment and either continue it or switch to an alternative one after some time (the waiting time). Treatment is then a binary, non-reversible, time-dependent variable. A typical problem is comparing outcomes of chemotherapy vs stem-cell transplantation in Acute Lymphoblastic Leukemia (ALL), where patients are initially treated with chemotherapy and may receive a bone marrow transplant during follow-up. In this context, the standard Kaplan-Meier method is unreliable because it is affected by immortal time bias. Two alternative non-parametric approaches have been proposed in the literature. Andersen et al. (1983) suggest classifying patients at a landmark time, which becomes the starting point for the Kaplan-Meier survival curve, and including only the patients still at risk at the landmark. The second approach, proposed by Simon and Makuch (1984), consists in dynamically updating over time the risk sets of the two time-dependent treatment groups. Both methods were presented on largely heuristic grounds, without specifying the theoretical quantities that the proposed estimators target. As a result, the interpretation of the curves estimated by the two methods has never been clarified. When the focus is not on the overall survival experience but on profile-specific prediction, i.e. prediction that accounts for the individual characteristics of the subjects, one must resort to semi-parametric or parametric regression models. The Cox model is the most popular one, but in the presence of time-varying effects and/or time-dependent covariates it cannot be used to obtain survival curves.
Among the possible alternatives we considered the fully parametric model by Hanley and Miettinen (2009) and the semi-parametric landmark regression model by Van Houwelingen (2007). The first estimates the hazard function over time by applying a logistic regression to an expanded dataset, created by splitting the observed survival time of each subject into a number of time units and treating the number of events in each interval as a Binomial random variable. The second originates from the idea of fitting the Cox model to multiple subsets of the data, each one created from a sliding landmark time point and including only the subjects at risk at that landmark; the treatment classification of these patients is frozen at the landmark, which allows the time-dependent covariates to be updated dynamically in each model and lets the parameter estimates vary over time. The aims of the dissertation are to review and develop methods for: 1) describing the survival experience according to a binary time-dependent treatment indicator, both from a fixed perspective and dynamically updated in time; 2) assessing the impact of prognostic factors on these quantities, in particular the waiting time to transplant, through interpretable parameters; 3) developing profile-specific predictions. In the first part of this work we clarify the theoretical quantities estimated by the landmark and Simon-Makuch methods. In addition, we present a novel approach based on counterfactual questions and dynamic prediction, and check the validity of our findings using simulations. In the second part, we review the Hanley-Miettinen and landmark regression models, show how to use them to properly estimate the effect of waiting time to transplant and to make profile-specific dynamic predictions on a real ALL dataset, and compare the performance of the two models using simulations.
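The two non-parametric approaches named in the abstract can be illustrated with a small sketch. It is not taken from the thesis: the toy patients, the function names (km_curve, landmark_groups, simon_makuch_groups) and the tie conventions are all assumptions made for illustration. Both approaches are expressed as a product-limit estimate over (entry, exit, event) intervals; they differ only in how each patient's follow-up is assigned to the two groups.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float, int]            # (entry time, exit time, event indicator)
Patient = Tuple[float, int, Optional[float]]   # (follow-up time, event, switch time or None)


def km_curve(intervals: List[Interval]) -> List[Tuple[float, float]]:
    """Product-limit estimate allowing delayed entry; returns (event time, survival) steps."""
    event_times = sorted({exit_ for _, exit_, ev in intervals if ev == 1})
    surv, steps = 1.0, []
    for t in event_times:
        # A subject is at risk at t if it entered before t and has not yet left.
        at_risk = sum(1 for entry, exit_, _ in intervals if entry < t <= exit_)
        deaths = sum(1 for entry, exit_, ev in intervals
                     if ev == 1 and exit_ == t and entry < t)
        if at_risk > 0:
            surv *= 1.0 - deaths / at_risk
        steps.append((t, surv))
    return steps


def landmark_groups(patients: List[Patient], landmark: float):
    """Landmark approach: keep patients still at risk at the landmark and
    freeze their treatment status at that time."""
    chemo, transplant = [], []
    for time, event, switch in patients:
        if time <= landmark:
            continue  # no longer at risk at the landmark
        switched = switch is not None and switch <= landmark
        (transplant if switched else chemo).append((landmark, time, event))
    return chemo, transplant


def simon_makuch_groups(patients: List[Patient]):
    """Simon-Makuch approach: a patient leaves the chemotherapy risk set
    (censored) at the switch and enters the transplant risk set from then on."""
    chemo, transplant = [], []
    for time, event, switch in patients:
        if switch is None or switch >= time:
            chemo.append((0.0, time, event))
        else:
            chemo.append((0.0, switch, 0))
            transplant.append((switch, time, event))
    return chemo, transplant


# Hypothetical toy data: (follow-up time, event, switch time or None).
patients = [(2.0, 1, None), (3.0, 1, 1.0), (4.0, 0, None),
            (5.0, 1, 2.0), (6.0, 1, None), (8.0, 0, 3.0)]

lm_chemo, lm_tx = landmark_groups(patients, landmark=2.5)
sm_chemo, sm_tx = simon_makuch_groups(patients)

print("landmark, chemotherapy:", km_curve(lm_chemo))
print("landmark, transplant:", km_curve(lm_tx))
print("Simon-Makuch, chemotherapy:", km_curve(sm_chemo))
print("Simon-Makuch, transplant:", km_curve(sm_tx))
```

In this sketch the patient who switches at time 3.0 is counted in the chemotherapy group by the landmark method (the status is frozen at 2.5), whereas under the Simon-Makuch construction the same patient moves to the transplant risk set at 3.0.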

In clinical studies it is often of interest to compare the survival of patients belonging to two or more treatment groups. In some situations the classification is not made at the start of follow-up but changes over time. For example, all patients receive an initial treatment; some continue it while others switch after a certain period of time. In this case the treatment is represented by a binary time-dependent variable. A typical setting is the comparison between chemotherapy and stem-cell transplantation in Acute Lymphoblastic Leukemia. In this situation the Kaplan-Meier method cannot be used because it is affected by immortal time bias. Two alternative non-parametric approaches have been proposed in the literature. Andersen et al. (1983) suggest classifying patients at a "landmark" time, which is the starting point for the estimation of the survival curve, and including only the patients still at risk at the landmark. The second method, proposed by Simon and Makuch (1984), consists in dynamically updating the risk sets of the two time-dependent treatment groups. Both methods were presented heuristically, without specifying the theoretical quantities that correspond to the proposed estimators. The interpretation of the curves estimated by the two methods has therefore never been clarified. When the interest lies not in overall survival but in profile-specific prediction, that is, prediction that takes into account the individual characteristics of the subjects, parametric or semi-parametric regression methods are needed. The Cox model is the most popular one, but in the presence of time-dependent effects and/or time-dependent covariates it cannot be used to obtain survival curves. Among the possible alternatives, the parametric model of Hanley and Miettinen (2009) and the semi-parametric landmark regression model of Van Houwelingen (2007) were considered.
The first is based on estimating the hazard function over time by applying a logistic regression to an expanded dataset, created by splitting the observed survival time of each subject into a number of time units and treating the number of events in each time interval as a Binomial random variable. The second method stems from the idea of fitting the Cox model to multiple subsets of the dataset, each created from a sliding landmark time and including only the subjects at risk at the landmark; the treatment classification of these patients is fixed at that time, which allows the values of the time-dependent covariates to be updated dynamically in each model and lets the estimated coefficients vary over time. The aims of this work are to review and develop methods for: 1) describing survival as a function of a binary time-dependent covariate, both from a fixed perspective and dynamically over time; 2) assessing the impact of prognostic factors on these quantities, in particular the waiting time to transplant, using interpretable parameters; 3) developing profile-specific predictions. The first part of the work aims to clarify the meaning of the theoretical quantities estimated by the landmark and Simon-Makuch methods. In addition, a novel approach based on counterfactual questions and dynamic prediction is presented, and the validity of the results is checked through simulations. The second part presents the Hanley-Miettinen and landmark regression models and shows how to use them to estimate the effect of the waiting time to transplant and to produce profile-specific predictions on real data from patients with Acute Lymphoblastic Leukemia, comparing the performance of the models through simulations.
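The data expansion underlying the logistic-regression approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: follow-up is measured in whole time units, the function name expand, the row layout and the coding of waiting_time (0 before transplant) are hypothetical, and the final tally is only a crude events-per-person-unit summary standing in for the logistic regression that would be fitted to the expanded rows.

```python
from typing import Dict, List, Optional, Tuple

# (follow-up in whole time units, switch unit or None, event indicator); hypothetical toy data
Patient = Tuple[int, Optional[int], int]


def expand(patients: List[Patient]) -> List[Dict[str, int]]:
    """Split each subject's follow-up into one row per time unit.

    Unit u covers the interval (u, u + 1]. The event indicator is 1 only in the
    last unit of a subject who experienced the event, so each row can be treated
    as a Binomial (here Bernoulli) observation.
    """
    rows = []
    for pid, (time, switch, event) in enumerate(patients):
        for u in range(time):
            transplanted = int(switch is not None and switch <= u)
            rows.append({
                "id": pid,
                "unit": u,
                "transplanted": transplanted,
                # Illustrative coding: waiting time is set to 0 before the switch.
                "waiting_time": switch if transplanted else 0,
                "event": int(event == 1 and u == time - 1),
            })
    return rows


patients = [(2, None, 1), (3, 1, 1), (4, None, 0),
            (5, 2, 1), (6, None, 1), (8, 3, 0)]
rows = expand(patients)

# Crude events per person-unit by current treatment state. A logistic regression
# on `rows` (with functions of `unit`, `transplanted`, `waiting_time` as
# covariates) would replace this tally in a real analysis.
crude = {}
for state in (0, 1):
    subset = [r for r in rows if r["transplanted"] == state]
    crude[state] = sum(r["event"] for r in subset) / len(subset)

print(len(rows), "person-unit rows")
print("crude hazard per unit, chemotherapy:", crude[0])
print("crude hazard per unit, transplant:", crude[1])
```

Because treatment status is evaluated separately in each row, a patient contributes person-time to the chemotherapy state before the switch and to the transplant state afterwards, which is what protects the comparison from immortal time bias.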

Bernasconi, D. P. (2015). Dynamic prediction in survival analysis with binary non-reversible time-dependent treatment indicator. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2015).

Dynamic prediction in survival analysis with binary non-reversible time-dependent treatment indicator

BERNASCONI, DAVIDE PAOLO
2015

ANTOLINI, LAURA
Survival analysis; Kaplan-Meier; Time-dependent treatment; Landmark; Immortal time bias; Dynamic prediction; Flexible regression models
MED/01 - Medical Statistics
English
5 March 2015
Scuola di Dottorato in Scienze Mediche Sperimentali e Cliniche
EPIDEMIOLOGIA E BIOSTATISTICA - 64R
27
2013/2014
open
Files in this item:
File: phd_unimib_072743.pdf
Access: Open access
Description: Doctoral thesis
Attachment type: Doctoral thesis
Size: 1.49 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/76772