Graziano, F., Valsecchi, M., Rebora, P. (2021). Sampling strategies to evaluate the prognostic value of a new biomarker on a time-to-event end-point. BMC Medical Research Methodology, 21(1) [10.1186/s12874-021-01283-0].

Sampling strategies to evaluate the prognostic value of a new biomarker on a time-to-event end-point

Graziano F. (first author); Valsecchi M. G.; Rebora P. (last author)
2021

Abstract

Background: The availability of large epidemiological or clinical databases with stored biological samples allows the prognostic value of novel biomarkers to be studied, but efficient designs are needed to select the subsample on which to measure them, for reasons of parsimony and cost. Two-phase stratified sampling is a flexible approach to such sub-sampling, but the literature on which stratification variables to use in the sampling and on power evaluation is sparse, especially for survival data. Methods: We compared the performance of different sampling designs for assessing the prognostic value of a new biomarker on a time-to-event endpoint, applying a Cox model weighted by the inverse of the empirical inclusion probability. Results: Our simulation results suggest that case-control sampling stratified (or post-stratified) by a surrogate variable of the marker can perform better than simple random, probability proportional to size, and unstratified case-control sampling. In the presence of a high censoring rate, results showed an advantage of nested case-control and counter-matching designs in terms of design effect, although the use of a fixed ratio between cases and controls might be disadvantageous. On real data on childhood acute lymphoblastic leukemia, we found that optimal sampling using pilot data is highly efficient. Conclusions: Our study suggests that, in our sample, case-control sampling stratified by a surrogate and nested case-control sampling yield estimates and power comparable to those obtained in the full cohort while strongly decreasing the number of patients required. We recommend planning the sample size and using sampling designs when exploring novel biomarkers in clinical cohort data.
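The weighting described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the cohort is simulated, and the names `two_phase_sample`, `n_per_stratum`, `event` and `surrogate` are hypothetical. It draws a phase-two sample stratified by event status and a binary surrogate, then assigns each sampled subject a weight equal to the inverse of the empirical inclusion probability in its stratum (sampled / available). These weights are what one would pass to a weighted Cox routine.

```python
import random


def two_phase_sample(cohort, n_per_stratum, seed=0):
    """Draw a stratified phase-two sample and attach inverse-probability weights.

    cohort: list of dicts with keys 'event' (0/1) and 'surrogate' (0/1).
    n_per_stratum: maximum number of subjects drawn from each stratum
                   (an illustrative allocation, not an optimal one).
    """
    rng = random.Random(seed)

    # Strata are defined by event status crossed with the surrogate variable.
    strata = {}
    for subject in cohort:
        key = (subject["event"], subject["surrogate"])
        strata.setdefault(key, []).append(subject)

    sample = []
    for key, members in strata.items():
        n = min(n_per_stratum, len(members))
        chosen = rng.sample(members, n)
        # Empirical inclusion probability: sampled / available in the stratum.
        p = n / len(members)
        for subject in chosen:
            sample.append({**subject, "stratum": key, "weight": 1.0 / p})
    return sample


# Hypothetical phase-one cohort: the surrogate is associated with the event.
rng = random.Random(42)
cohort = []
for i in range(2000):
    surrogate = int(rng.random() < 0.3)
    event = int(rng.random() < (0.25 if surrogate else 0.10))
    cohort.append({"id": i, "event": event, "surrogate": surrogate})

sample = two_phase_sample(cohort, n_per_stratum=50)

# The weights of the sampled subjects add up to the full cohort size,
# which is a quick sanity check on the inclusion probabilities.
weighted_size = sum(s["weight"] for s in sample)
print(len(sample), round(weighted_size))
```

Because the inclusion probabilities are computed from the realized stratum counts, the weights within each stratum sum to that stratum's phase-one size, so the weighted sample represents the whole cohort.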
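Document type: Journal article - Scientific article
Keywords: Case-control design; Cohort studies; Power; Two-phase sampling; Weighted Cox model
Language: English
Publication date: 30 April 2021
Year: 2021
Volume: 21
Issue: 1
First page / article number: 93
Access: open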
Files in this record:
File: Graziano2phaseBMC2021s12874-021-01283-0.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
Size: 945.55 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/321894
Citations
  • Scopus 6
  • Web of Science (ISI) 5