Bicocca Open Archive

Feature analysis for fake review detection through supervised classification

Fontanarava, J;Pasi, G;Viviani, M

2017

Abstract

Review sites are increasingly confronted with the spread of misinformation in the form of opinion spam, which aims to promote or damage target businesses by misleading either human readers or automated opinion mining and sentiment analysis systems. For this reason, several data-driven approaches have been proposed in recent years to assess the credibility of user-generated content diffused through social media in the form of online reviews. Distinct approaches often consider different subsets of characteristics, i.e., features, connected to both reviews and reviewers, as well as to the network structure linking distinct entities on the review site under examination. This article analyzes the main review- and reviewer-centric features that have been proposed in the literature so far to detect fake reviews, in particular by approaches that employ supervised machine learning techniques. These solutions generally provide better results than purely unsupervised approaches, which are often based on graph-based methods that consider relational ties in review sites. Furthermore, this work proposes and evaluates some additional new features that can be suitable for classifying genuine and fake reviews. For this purpose, a supervised classifier based on Random Forests has been implemented, considering both well-known and new features, together with a large-scale labeled dataset from which all these features have been extracted. The good results obtained show the effectiveness of the new features in detecting singleton fake reviews in particular, and the utility of this study in general.
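To make the distinction between review-centric and reviewer-centric features concrete, the following minimal sketch computes a few such features from toy data. The abstract does not list the paper's actual feature set, so every name here (the record fields, `extract_features`, `word_count`, `rating_deviation`, `reviewer_review_count`, `is_singleton`) is a hypothetical illustration. The sketch also assumes that a singleton review is the only review written by its reviewer.

```python
from collections import Counter, defaultdict
from statistics import mean

# Toy records; field names and values are invented for illustration only.
reviews = [
    {"reviewer": "u1", "business": "b1", "rating": 5, "text": "Great food and friendly staff"},
    {"reviewer": "u1", "business": "b2", "rating": 4, "text": "Nice place"},
    {"reviewer": "u2", "business": "b1", "rating": 1, "text": "Terrible"},
    {"reviewer": "u3", "business": "b1", "rating": 3, "text": "Average experience overall"},
]


def extract_features(reviews):
    """Return one feature dict per review (hypothetical feature set)."""
    # Reviewer-centric statistic: how many reviews each reviewer wrote.
    reviews_per_reviewer = Counter(r["reviewer"] for r in reviews)

    # Mean rating per business, used to measure how far a rating deviates.
    ratings_per_business = defaultdict(list)
    for r in reviews:
        ratings_per_business[r["business"]].append(r["rating"])
    mean_rating = {b: mean(v) for b, v in ratings_per_business.items()}

    features = []
    for r in reviews:
        count = reviews_per_reviewer[r["reviewer"]]
        features.append({
            # Review-centric features.
            "word_count": len(r["text"].split()),
            "rating_deviation": abs(r["rating"] - mean_rating[r["business"]]),
            # Reviewer-centric features.
            "reviewer_review_count": count,
            "is_singleton": count == 1,
        })
    return features


features = extract_features(reviews)
for row in features:
    print(row)
```

Feature dictionaries of this kind could then be turned into numeric vectors and passed, together with genuine/fake labels, to a supervised classifier such as a Random Forest. That step is omitted here because the record gives no details of the training setup.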

Contribution type: slide + paper
Keywords: Feature extraction, Social network services, Pragmatics, Media, Electronic mail, Business, User-generated content
Content language: English
Conference name: 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Conference year: 2017
Proceedings title: Data Science and Advanced Analytics (DSAA), 2017 IEEE International Conference on
Proceedings ISBN: 9781509050048
Publication date: 2017
Volume number: 2018-
First page: 658
Last page: 666
DOI: https://dx.doi.org/10.1109/DSAA.2017.51
Fulltext: none
Citation: Fontanarava, J., Pasi, G., Viviani, M. (2017). Feature analysis for fake review detection through supervised classification. In Data Science and Advanced Analytics (DSAA), 2017 IEEE International Conference on (pp.658-666). Institute of Electrical and Electronics Engineers Inc. [10.1109/DSAA.2017.51].
Appears in types: 02 - Conference contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/187620
