Most evaluations of novel algorithmic contributions assess their accuracy in predicting what was withheld in an offline evaluation scenario. However, several doubts have been raised that standard offline evaluation practices are not appropriate to select the best algorithm for field deployment. The goal of this work is therefore to compare the offline and the online evaluation methodology with the same study participants, i.e. a within users experimental design. This paper presents empirical evidence that the ranking of algorithms based on offline accuracy measurements clearly contradicts the results from the online study with the same set of users. Thus the external validity of the most commonly applied evaluation methodology is not guaranteed.

Stella, F., Rossetti, M., Zanker, M. (2016). Contrasting offline and online results when evaluating recommendation algorithms. In RecSys 2016 - Proceedings of the 10th ACM Conference on Recommender Systems (pp. 31-34). New York: Association for Computing Machinery, Inc. [10.1145/2959100.2959176].

Contrasting offline and online results when evaluating recommendation algorithms

STELLA, FABIO ANTONIO
Penultimate author
2016

paper
User study, Evaluation methodology, Experimental within users design
English
ACM Conference on Recommender Systems, RecSys 15-19 September
2016
RecSys 2016 - Proceedings of the 10th ACM Conference on Recommender Systems
9781450340359
2016
pp. 31-34
reserved
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/132276