Bicocca Open Archive

Contrasting offline and online results when evaluating recommendation algorithms

Stella, Fabio Antonio; Rossetti, M.; Zanker, M.

2016

Abstract

Most evaluations of novel algorithmic contributions assess their accuracy in predicting what was withheld in an offline evaluation scenario. However, several doubts have been raised that standard offline evaluation practices are not appropriate to select the best algorithm for field deployment. The goal of this work is therefore to compare the offline and the online evaluation methodology with the same study participants, i.e. a within users experimental design. This paper presents empirical evidence that the ranking of algorithms based on offline accuracy measurements clearly contradicts the results from the online study with the same set of users. Thus the external validity of the most commonly applied evaluation methodology is not guaranteed.

	Contribution type
	
				paper
			
	Keywords
	
				Evaluation methodology; Experimental within users design; User study
			
	Content language
	
				English
			
	Conference name
	
				ACM Conference on Recommender Systems, RecSys 15-19 September
			
	Conference year
	
				2016
			
	Proceedings title
	
				RecSys 2016 - Proceedings of the 10th ACM Conference on Recommender Systems
			
	Proceedings ISBN
	
				9781450340359
			
	Publication date
	
				2016
			
	First page
	
				31
			
	Last page
	
				34
			
	DOI
	
				https://dx.doi.org/10.1145/2959100.2959176
			
	Fulltext
	
				reserved
			
	Citation
	
				Stella, F., Rossetti, M., Zanker, M. (2016). Contrasting offline and online results when evaluating recommendation algorithms. In RecSys 2016 - Proceedings of the 10th ACM Conference on Recommender Systems (pp. 31-34). New York: Association for Computing Machinery, Inc [10.1145/2959100.2959176].
			
	Appears in types:
	
				02 - Conference contribution

Files in this item:

File	Size	Format
2016 - Proceedings - RecSys2016.pdf (archive managers only; attachment type: Publisher’s Version (Version of Record, VoR))	799.62 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/132276
