The amount of textual data on the Web has grown rapidly in the last few years, creating unique content of massive dimensions that constitutes fertile ground for Sentiment Analysis. In particular, social networks represent an emerging and challenging sector where people's natural language expressions are easily reported through short but meaningful text messages. This unprecedented volume of content needs to be analyzed efficiently and effectively to create actionable knowledge for decision-making processes. A key piece of information that can be grasped from social environments relates to the polarity of text messages, i.e., the sentiment (positive, negative or neutral) that the messages convey. However, most work on polarity classification treats text as the only information from which to infer sentiment, without taking into account that social networks are actually networked environments. A representation of real-world data in which instances are considered homogeneous, independent and identically distributed (i.i.d.) leads to a substantial loss of information and to the introduction of a statistical bias. For this reason, the combination of content and relationships is a core task of the recent literature on Sentiment Analysis, where friendships are usually investigated to model the principle of homophily (contact among similar people occurs at a higher rate than among dissimilar people). However, paired with the assumption of homophily, constructuralism explains how social relationships evolve via dynamic and continuous interactions as the knowledge and behavior that two actors share increase. Similarity among users, considered on the basis of constructuralism, appears to be a much more powerful force than interpersonal influence within the friendship network. As a first contribution, this Ph.D. thesis proposes the Approval Network as a novel graph representation to jointly model homophily and constructuralism, which is intended to better represent contagion on social networks. Starting from the classical state-of-the-art methodologies where only text is used to infer the polarity of social network messages, this thesis presents novel Probabilistic Relational Models at the user, document and aspect levels, which integrate structural information to improve classification performance. The integration is particularly useful when textual features do not provide sufficient or explicit information to infer sentiment (e.g., "I agree!"). The experimental investigations reveal that incorporating network information through approval relations can lead to statistically significant improvements over the performance of complex learning approaches based only on textual features.
(2015). Probabilistic Relational Models for Sentiment Analysis in Social Networks. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2015).
|Publication date:||12-Feb-2015|
|Title:||Probabilistic Relational Models for Sentiment Analysis in Social Networks|
|Scientific Disciplinary Sector:||INF/01 - INFORMATICA|
|Doctoral school:||Scuola di dottorato di Scienze|
|Doctoral program:||INFORMATICA - 22R|
|Citation:||(2015). Probabilistic Relational Models for Sentiment Analysis in Social Networks. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2015).|
|Keywords (English):||Sentiment Analysis; Opinion Mining; Social Network Analysis; Social Networks; Probabilistic Relational Models|
|Appears in collections:||07 - Tesi di dottorato Bicocca post 2009|