Cut-point finding methods for continuous biomarkers

Rota, M

My PhD dissertation deals with statistical methods for finding cut-points for continuous biomarkers. Categorization is often needed for clinical decision making when dealing with diagnostic (or prognostic) biomarkers and a dichotomous or censored failure time outcome. It allows the definition of two or more prognostic risk groups, or the stratification of patients for inclusion in randomized clinical trials (RCTs). We investigate the following cut-point finding methods: minimum P-value, Youden index, concordance probability and point closest-to-(0,1) corner in the ROC plane. We compare them by assuming both Normal and Gamma biomarker distributions, showing whether they lead to the identification of the same true cut-point and further investigating their performance by simulation. Within the framework of censored survival data, we consider new approaches to estimating the optimal cut-point, which use a conditional weighting method to estimate the true positive and false positive fractions. Motivating examples on real datasets are discussed within the dissertation for both the dichotomous and the censored failure time outcome. In all simulation scenarios, the point closest-to-(0,1) corner in the ROC plane and the concordance probability approaches outperformed the other methods. Both methods showed good performance in the estimation of the optimal cut-point of a biomarker. However, to make results easier to communicate, the Youden index or the concordance probability associated with the estimated cut-point could be reported to summarize the associated classification accuracy. The use of the minimum P-value approach for cut-point finding is not recommended because its objective function is computed under the null hypothesis of no association between the true disease status and the biomarker X. This contradicts the premise that X has some discrimination potential, which is what motivates dichotomizing it in the first place.
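For the dichotomous outcome, the three ROC-based criteria named above can be sketched on empirical data as follows. This is a minimal illustration, not the estimation procedure used in the dissertation: the function name `empirical_cutpoints`, the convention that higher biomarker values indicate disease, and the rule "positive if x >= c" are all assumptions made here. The censored-data weighting approach is not covered.

```python
import numpy as np

def empirical_cutpoints(x, d):
    """Return the empirical cut-point chosen by three ROC-based criteria.

    x : biomarker values (assumed: higher values suggest disease)
    d : true status, 1 = diseased, 0 = non-diseased
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=int)
    candidates = np.unique(x)  # every observed value is a candidate cut-point

    # Sensitivity and specificity when classifying "positive" if x >= c
    se = np.array([np.mean(x[d == 1] >= c) for c in candidates])
    sp = np.array([np.mean(x[d == 0] < c) for c in candidates])

    youden = se + sp - 1                              # maximize
    concordance = se * sp                             # maximize
    distance = np.sqrt((1 - se) ** 2 + (1 - sp) ** 2)  # minimize

    # argmax/argmin return the first optimum, so ties go to the lowest cut-point
    return {
        "youden": candidates[np.argmax(youden)],
        "concordance": candidates[np.argmax(concordance)],
        "closest_to_01": candidates[np.argmin(distance)],
    }
```

The three criteria need not select the same cut-point on a given sample, which is why the sketch returns all three side by side.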
The investigated cut-point finding methods are based on measures, namely sensitivity and specificity, that are defined conditionally on the outcome. My PhD dissertation raises the question of whether these methods could instead be applied starting from predictive values, which typically represent the most useful information for clinical decisions on treatments. However, while sensitivity and specificity are invariant to disease prevalence, predictive values vary across populations with different disease prevalence. This is an important drawback of using predictive values for cut-point finding. More generally, great care should be taken when establishing a biomarker cut-point for clinical use. Methods for categorizing new biomarkers are often essential in clinical decision making, even though categorizing a continuous biomarker comes at a considerable loss of power and information. In the future, methods that study the functional form of the relationship between the biomarker and the outcome through regression techniques, such as fractional polynomials or spline functions, should be considered as an alternative way to define cut-points for clinical use. Moreover, in spite of the aforementioned drawback of predictive values, we also think that new methods for cut-point finding should be developed starting from them.
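The prevalence dependence of predictive values mentioned above follows directly from Bayes' theorem. The notation below is an assumption introduced for illustration ($Se$ and $Sp$ for sensitivity and specificity at a fixed cut-point, $\pi$ for disease prevalence) and is not taken from the dissertation.

```latex
\[
\mathrm{PPV} = \frac{Se\,\pi}{Se\,\pi + (1 - Sp)(1 - \pi)},
\qquad
\mathrm{NPV} = \frac{Sp\,(1 - \pi)}{Sp\,(1 - \pi) + (1 - Se)\,\pi}.
\]
% Example with Se = Sp = 0.9 held fixed:
%   \pi = 0.50:  PPV = 0.45  / (0.45  + 0.05 ) = 0.90
%   \pi = 0.05:  PPV = 0.045 / (0.045 + 0.095) \approx 0.32
```

With the same cut-point and the same sensitivity and specificity, the positive predictive value falls from 0.90 to about 0.32 when prevalence drops from 50% to 5%, so a cut-point optimized on predictive values in one population may not transfer to another.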

(2013). Cut-point finding methods for continuous biomarkers. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2013).