Ershov, E., Panshin, A., Ermakov, I., Banić, N., Savchik, A., & Bianco, S. (2024). Reliability and Stability of Mean Opinion Score for Image Aesthetic Quality Assessment Obtained Through Crowdsourcing. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (pp. 365-372). Science and Technology Publications, Lda. DOI: 10.5220/0012462000003660
Reliability and Stability of Mean Opinion Score for Image Aesthetic Quality Assessment Obtained Through Crowdsourcing
Bianco, S.
2024
Abstract
Image quality assessment (IQA) is widely used to evaluate the results of image processing methods. While the development of objective IQA metrics has seen much progress in recent years, there are still many tasks for which subjective IQA is clearly preferable. Subjective IQA has become even more attractive since crowdsourcing platforms such as Amazon Mechanical Turk and Toloka became available. However, for some specific image processing tasks, several questions related to subjective IQA have not yet been answered in a satisfactory way. An example of such a task is the evaluation of image rendering styles where, unlike in the case of distortions, none of the evaluated styles can objectively be regarded as a priori better or worse than the others. The questions that have remained open until now are whether the scores obtained for such a task through crowdsourced subjective IQA are reliable and whether they remain stable, i.e., similar when the evaluation is repeated over time. To answer these questions, in this paper several images and styles are first selected and defined; they are then evaluated using crowdsourced subjective IQA on the Toloka platform, and the obtained scores are numerically analyzed. Experimental results confirm the reliability and stability of crowdsourced subjective IQA for the problem in question.
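As a rough sketch of the kind of analysis the abstract describes (all data, names, and parameters below are hypothetical and not taken from the paper), the Mean Opinion Score per style is simply the mean of the collected ratings, and stability across repeated evaluation sessions can be probed with a rank correlation between the per-style MOS values of the two sessions:

```python
# Illustrative sketch only: simulated ratings, not the authors' actual pipeline.
# Computes a Mean Opinion Score (MOS) per rendering style from crowdsourced
# ratings and checks stability across two repeated sessions via Spearman's rho.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_styles, n_workers = 6, 50
true_quality = rng.uniform(1, 5, size=n_styles)  # hypothetical latent appeal per style

def run_session():
    """Simulate one crowdsourcing session: each worker rates every style on a 1-5 scale."""
    noise = rng.normal(0, 0.7, size=(n_workers, n_styles))
    ratings = np.clip(np.round(true_quality + noise), 1, 5)
    return ratings.mean(axis=0)  # MOS = mean rating per style

mos_session_1 = run_session()
mos_session_2 = run_session()  # same task repeated later in time

# Stability check: do the two sessions rank the styles similarly?
rho, p = spearmanr(mos_session_1, mos_session_2)
print(f"MOS session 1: {np.round(mos_session_1, 2)}")
print(f"MOS session 2: {np.round(mos_session_2, 2)}")
print(f"Spearman rho between sessions: {rho:.3f} (p = {p:.4f})")
```

A high rank correlation between the two sessions would indicate the kind of stability the paper investigates; the paper's own numerical analysis should be consulted for the actual methodology and thresholds used.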