Bicocca Open Archive

Most evaluations of novel algorithmic contributions assess their accuracy in predicting what was withheld in an offline evaluation scenario. However, several doubts have been raised that standard offline evaluation practices are not appropriate to select the best algorithm for field deployment. The goal of this work is therefore to compare the offline and the online evaluation methodology with the same study participants, i.e., a within-users experimental design. This paper presents empirical evidence that the ranking of algorithms based on offline accuracy measurements clearly contradicts the results from the online study with the same set of users. Thus the external validity of the most commonly applied evaluation methodology is not guaranteed.

Stella, F., Rossetti, M., Zanker, M. (2016). Contrasting offline and online results when evaluating recommendation algorithms. In RecSys 2016 - Proceedings of the 10th ACM Conference on Recommender Systems (pp.31-34). New York : Association for Computing Machinery, Inc [10.1145/2959100.2959176].

Contrasting offline and online results when evaluating recommendation algorithms

Stella, Fabio Antonio; Rossetti, M.; Zanker, M.

2016

Contribution type	Paper
Keywords	User study, Evaluation methodology, Experimental within users design
Content language	English
Conference name	ACM Conference on Recommender Systems, RecSys 15-19 September
Conference year	2016
Proceedings title	RecSys 2016 - Proceedings of the 10th ACM Conference on Recommender Systems
Proceedings ISBN	9781450340359
Publication date	2016
First page	31
Last page	34
DOI	https://dx.doi.org/10.1145/2959100.2959176
Full text	Reserved
Appears in categories	02 - Conference contribution

Files in this record:

File	Access	Attachment type	Size	Format
2016 - Proceedings - RecSys2016.pdf	Archive managers only	Publisher’s Version (Version of Record, VoR)	799.62 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/132276
