Bicocca Open Archive

On the Use of Visual Transformer for Image Complexity Assessment

Celona, L.; Ciocca, G.; Schettini, R.

2024

Abstract

Perceiving image complexity is a crucial aspect of human visual understanding, yet explicitly assessing image complexity poses challenges. Historically, this aspect has been understudied because of its inherent subjectivity, which stems from its reliance on human perception, and because image complexity depends on semantics across diverse real-world images. Different computational models for image complexity estimation have been proposed in the literature. These models leverage a variety of techniques, ranging from low-level handcrafted features to advanced machine learning algorithms. This paper explores the use of recent deep-learning approaches based on Visual Transformers to extract robust information for image complexity estimation in a transfer learning paradigm. Specifically, we propose to leverage three visual backbones (CLIP, DINO-v2, and ImageNetViT) as feature extractors, coupled with a Support Vector Regressor with a Radial Basis Function kernel as the image complexity estimator. We test our approach on two widely used benchmark datasets, IC9600 and SAVOIAS, in both an intra-dataset and an inter-dataset workflow. Our experiments demonstrate the effectiveness of the CLIP-based features for accurate image complexity estimation, with results comparable to end-to-end solutions.
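
The regression stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes that backbone features have already been extracted, and it replaces them with synthetic random vectors so that the example runs on its own. The helper names (`make_synthetic_dataset`, `fit_complexity_regressor`, `pearson`, `spearman`), the SVR hyperparameters, the feature standardization step, and the use of Pearson and Spearman correlation as evaluation measures are all assumptions made for illustration; only the choice of a Support Vector Regressor with an RBF kernel and the intra-dataset/inter-dataset split come from the abstract.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_dataset(n_images, n_dims, weights, seed):
    """Synthetic stand-in for (backbone features, human complexity scores)."""
    rng = np.random.default_rng(seed)
    feats = rng.normal(size=(n_images, n_dims))
    # Hypothetical ground truth: a smooth function of the features in (0, 1),
    # plus a little noise. Real scores would come from human annotations.
    scores = 1.0 / (1.0 + np.exp(-feats @ weights))
    scores += rng.normal(scale=0.02, size=n_images)
    return feats, scores


def fit_complexity_regressor(feats, scores):
    """Fit an RBF-kernel SVR on frozen features (hyperparameters are assumed)."""
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.05))
    model.fit(feats, scores)
    return model


def pearson(a, b):
    """Pearson linear correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


def spearman(a, b):
    """Spearman rank correlation; the double argsort assumes no tied values."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return pearson(ranks_a, ranks_b)


n_dims = 8
weights = np.random.default_rng(0).normal(size=n_dims) / np.sqrt(n_dims)
feats_a, scores_a = make_synthetic_dataset(300, n_dims, weights, seed=1)
feats_b, scores_b = make_synthetic_dataset(100, n_dims, weights, seed=2)

# Intra-dataset workflow: train and test on disjoint splits of dataset A.
model = fit_complexity_regressor(feats_a[:225], scores_a[:225])
intra_pred = model.predict(feats_a[225:])

# Inter-dataset workflow: reuse the model trained on A to score dataset B.
inter_pred = model.predict(feats_b)

print("intra-dataset:", pearson(scores_a[225:], intra_pred), spearman(scores_a[225:], intra_pred))
print("inter-dataset:", pearson(scores_b, inter_pred), spearman(scores_b, inter_pred))
```

In a real setting, `feats_a` and `feats_b` would be replaced by embeddings computed once from a frozen backbone, so that only the lightweight regressor is trained.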

Contribution type
	slide + paper
Keywords
	Feature Extraction; Image Complexity; Self-Supervised; Supervised; Transfer Learning; Vision Transformers
Content language
	English
Conference name
	19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2024 - 27 February 2024 through 29 February 2024
Conference year
	2024
Proceedings title
	Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Proceedings volume ISBN
	9789897586798
Publication date
	2024
Volume number
	3
First page
	640
Last page
	647
Contribution DOI
	https://dx.doi.org/10.5220/0012426500003660
Full text
	none
Citation
	Celona, L., Ciocca, G., Schettini, R. (2024). On the Use of Visual Transformer for Image Complexity Assessment. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (pp. 640-647). Science and Technology Publications, Lda [10.5220/0012426500003660].
Appears in types:
	02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/464438
