Bicocca Open Archive

(2015). Probabilistic Relational Models for Sentiment Analysis in Social Networks. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2015).

Probabilistic Relational Models for Sentiment Analysis in Social Networks

POZZI, FEDERICO ALBERTO

2015

Abstract

The amount of textual data on the Web has grown rapidly in the last few years, creating unique content of massive dimensions that constitutes fertile ground for Sentiment Analysis. In particular, social networks represent an emerging and challenging sector where people's natural language expressions can be easily reported through short but meaningful text messages. This unprecedented volume of content needs to be analyzed efficiently and effectively to create actionable knowledge for decision-making processes. A key piece of information that can be grasped from social environments relates to the polarity of text messages, i.e., the sentiment (positive, negative or neutral) that the messages convey. However, most works on polarity classification consider text as the only information from which to infer sentiment, without taking into account that social networks are actually networked environments. A representation of real-world data where instances are considered homogeneous, independent and identically distributed (i.i.d.) leads to a substantial loss of information and to the introduction of a statistical bias. For this reason, the combination of content and relationships is a core task of the recent literature on Sentiment Analysis, where friendships are usually investigated to model the principle of homophily (contact among similar people occurs at a higher rate than among dissimilar people). However, paired with the assumption of homophily, constructuralism explains how social relationships evolve via dynamic and continuous interactions as the knowledge and behavior that two actors share increase. Similarity among users on the basis of constructuralism appears to be a much more powerful force than interpersonal influence within the friendship network. As a first contribution, this Ph.D. thesis proposes the Approval Network as a novel graph representation to jointly model homophily and constructuralism, which is intended to better represent contagion on social networks. Starting from the classical state-of-the-art methodologies where only text is used to infer the polarity of social network messages, this thesis presents novel Probabilistic Relational Models at the user, document and aspect level, which integrate structural information to improve classification performance. The integration is particularly useful when textual features do not provide sufficient or explicit information to infer sentiment (e.g., "I agree!"). The experimental investigations reveal that incorporating network information through approval relations can lead to statistically significant improvements over the performance of complex learning approaches based only on textual features.

	Bicocca-affiliated supervisor
	
				TISATO, FRANCESCO
			
	Keywords
	
				Sentiment Analysis; Opinion Mining; Social Network Analysis; Social Networks; Probabilistic Relational Models
			
	Scientific-disciplinary sectors (valid from 9 May 2024)
	
				Sector INFO-01/A - Computer Science
			
	* Content language
	
				English
			
	* Defense date
	
				12 Feb 2015
			
	Doctoral school
	
				Doctoral School of Sciences
			
	* Doctoral programme
	
				INFORMATICA - 22R
			
	* Doctoral cycle
	
				27
			
	* Academic year of degree award
	
				2013/2014
			
	Full text
	
				open
			
	Appears in types:
	
				07 - Bicocca doctoral theses after 2009

Files in this item:

File	Size	Format
PhD_unimib_700928.pdf (open access; description: doctoral thesis; attachment type: Doctoral thesis)	9.48 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/65709
