Bicocca Open Archive

De Capitani, L., De Martini, D. (2015). Reproducibility probability estimation and testing for the Wilcoxon rank-sum test. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 85(3), 468-493 [10.1080/00949655.2013.825721].

Reproducibility probability estimation and testing for the Wilcoxon rank-sum test

De Capitani, L.; De Martini, D.

2015

Abstract

The reproducibility probability (RP) of a statistically significant outcome is the true power of a statistical test, and its estimate is a useful indicator of the stability of the test result. RP-testing consists of testing statistical hypotheses using an RP-estimator as the test statistic. In the parametric framework, the RP-based test and the classical test are equivalent, whereas in the nonparametric framework RP-testing can be performed only approximately. In this work, we evaluate, through a wide simulation study, the performance of several semi-parametric and nonparametric RP-estimators (RPEs) for the Wilcoxon rank-sum (WRS) test. RPEs have two tasks: to perform RP-testing and to estimate the RP. To compare the performance of the RPEs, we adopt risk indexes (e.g. mean square error (MSE)) and an index of agreement between the outcomes of the WRS test and the RP-based test. Results indicate that the rate of disagreement tends to zero as the sample size increases; the overall rate of disagreement of the semi-parametric RPEs with finite samples (size 20–200 per group) is 0.15%, and that of the nonparametric RPEs is 0.58%. Concerning risk measures, no RPE dominates the others: for high power values, nonparametric RPEs have the lowest MSE, while on average the semi-parametric RPE based on the upper bound of the variance of the test statistic performs best. Nevertheless, the relative gains between the best and the worst RPE are quite small (5–10%). To conclude, well-approximated RP-testing for the WRS test can be performed by adopting a semi-parametric RPE. Since nonparametric plug-in-based RPEs perform well in the presence of high reproducibility, their adoption is suggested for evaluating the stability of test results, for example those of clinical trials.
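To make the idea of RP-testing concrete, the following is a minimal sketch under stated assumptions and not one of the estimators evaluated in the paper. It assumes a one-sided WRS test, the normal approximation with the null mean and null variance of the Mann-Whitney statistic (no tie correction), and a plug-in estimate of the form Φ(z − z_{1−α}). The function names `wrs_z`, `rp_estimate`, and `rp_test` are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def wrs_z(x, y):
    """Standardized Mann-Whitney U statistic for the alternative 'y tends to exceed x'.

    Assumption: null mean and null variance are used, with no tie correction.
    """
    n1, n2 = len(x), len(y)
    # U counts the pairs with y_j > x_i; a tie contributes one half.
    u = sum((yj > xi) + 0.5 * (yj == xi) for xi in x for yj in y)
    mean0 = n1 * n2 / 2
    var0 = n1 * n2 * (n1 + n2 + 1) / 12
    return (u - mean0) / var0 ** 0.5


def rp_estimate(x, y, alpha=0.025):
    """Plug-in RP estimate: approximate power evaluated at the observed statistic."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(wrs_z(x, y) - z_crit)


def rp_test(x, y, alpha=0.025):
    """RP-based decision rule: reject the null when the estimated RP exceeds 1/2."""
    return rp_estimate(x, y, alpha) > 0.5


if __name__ == "__main__":
    # Illustrative, made-up samples (not data from the paper).
    x = [0.1, 0.5, 0.9, 1.3, 1.7, 2.1]
    y = [2.5, 2.9, 3.3, 3.7, 4.1, 4.5]
    print(rp_estimate(x, y), rp_test(x, y))
```

In this simplified version the estimated RP exceeds 1/2 exactly when the standardized statistic exceeds the critical value, so the RP-based decision and the approximate WRS decision coincide by construction. Estimators that use a different variance approximation need not have this property, which is the kind of disagreement the abstract quantifies.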

Subtype: Journal article - Scientific article
Keywords: agreement indexes; asymptotic power approximations; plug-in power estimation; power estimation; risk indexes; RP-testing
Content language: English
Publication date: 2015
Journal: Journal of Statistical Computation and Simulation
Volume: 85
Issue: 3
First page: 468
Last page: 493
Article DOI: https://dx.doi.org/10.1080/00949655.2013.825721
Full text: reserved
Appears in types: 01 - Journal article

Files in this record:

File	Size	Format
109562.pdf (attachment type: Publisher's Version, Version of Record (VoR); access: archive managers only)	610.61 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/75539
