Bicocca Open Archive

Health Misinformation Detection in Web Content: A Structural-, Content-based, and Context-aware Approach based on Web2Vec

Upadhyay, Rishabh;Pasi, Gabriella;Viviani, Marco

2021

Abstract

In recent years, we have witnessed the proliferation of large amounts of online content generated directly by users with virtually no form of external control, which can lead to the spread of misinformation. The search for effective solutions to this problem is still ongoing and covers different application areas, from opinion spam to fake news detection. A scenario that has been investigated only more recently, despite the serious risks that exposure to disinformation could entail, is the online dissemination of health information. Early approaches in this area focused primarily on user-based studies applied to Web page content. More recently, automated approaches have been developed for both Web pages and social media content, particularly with the advent of the COVID-19 pandemic. These approaches are primarily based on handcrafted features extracted from online content and used in association with Machine Learning. In this scenario, we focus on Web page content, where there is still room for research on structural-, content- and context-based features for assessing the credibility of Web pages. This work therefore aims to study the effectiveness of such features in association with a deep learning model, starting from Web2Vec, an embedded representation of Web pages that has recently been proposed in the context of phishing Web page detection.

Contribution type: slide + paper
Keywords: Credibility; Deep Learning; Health Misinformation; Machine Learning; Social Web
Content language: English
Conference name: 1st Conference on Information Technology for Social Good, GoodIT 2021 - 9 September 2021 through 11 September 2021
Conference year: 2021
Proceedings title: GoodIT '21: Proceedings of the Conference on Information Technology for Social Good
Proceedings volume ISBN: 978-145038478-0
Publication date: 2021
First page: 19
Last page: 24
Contribution DOI: https://dx.doi.org/10.1145/3462203.3475898
Fulltext: none
Citation: Upadhyay, R., Pasi, G., Viviani, M. (2021). Health Misinformation Detection in Web Content: A Structural-, Content-based, and Context-aware Approach based on Web2Vec. In GoodIT '21: Proceedings of the Conference on Information Technology for Social Good (pp. 19-24). Association for Computing Machinery, Inc [10.1145/3462203.3475898].
Appears in types: 02 - Conference contribution

Files in this record:

There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/326678
