Bicocca Open Archive


Upadhyay, R., Pasi, G., Viviani, M. (2023). Vec4Cred: a model for health misinformation detection in web pages. MULTIMEDIA TOOLS AND APPLICATIONS, 82(4), 5271-5290 [10.1007/s11042-022-13368-z].

Vec4Cred: a model for health misinformation detection in web pages

Upadhyay R. (first author); Pasi G. (second author); Viviani M. (last author)

2023

Abstract

Research aimed at finding solutions to the problem of the diffusion of distinct forms of non-genuine information online across multiple domains has attracted growing interest in recent years, from opinion spam to fake news detection. Currently, partly due to the COVID-19 virus outbreak and the subsequent proliferation of unfounded claims and highly biased content, attention has focused on developing solutions that can automatically assess the genuineness of health information. Most of these approaches, applied both to Web pages and social media content, rely primarily on the use of handcrafted features in conjunction with Machine Learning. In this article, instead, we propose a health misinformation detection model that exploits as features the embedded representations of some structural and content characteristics of Web pages, which are obtained using an embedding model pre-trained on medical data. Such features are employed within a deep learning classification model, which categorizes genuine health information versus health misinformation. The purpose of this article is therefore to evaluate the effectiveness of the proposed model, namely Vec4Cred, with respect to the problem considered. This model represents an evolution of a previous one, with respect to which new features and architectural choices have been considered and illustrated in this work.


Subtype: Journal article - Scientific article
Keywords: Consumer health; Deep learning; Health misinformation; Machine learning; Natural language processing
Content language: English
Ahead-of-print / first online publication date: 28 Jul 2022
Publication date: 2023
Journal: MULTIMEDIA TOOLS AND APPLICATIONS
Volume: 82
Issue: 4
First page: 5271
Last page: 5290
Article DOI: https://dx.doi.org/10.1007/s11042-022-13368-z
Full text: open
Citation: Upadhyay, R., Pasi, G., Viviani, M. (2023). Vec4Cred: a model for health misinformation detection in web pages. MULTIMEDIA TOOLS AND APPLICATIONS, 82(4), 5271-5290 [10.1007/s11042-022-13368-z].
Appears in types: 01 - Journal article

Files in this item:

File	Access	Attachment type	License	Size	Format
10281-390490_VoR.pdf	Open access	Publisher's Version (Version of Record, VoR)	Creative Commons	850.66 kB	Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/390490
