Bicocca Open Archive

Giannini, F., Marelli, M., Stella, F., Monzani, D., & Pancani, L. (2024). Surfing the OCEAN: The machine learning psycholexical approach 2.0 to detect personality traits in texts. Journal of Personality, 92(6), 1602-1615. https://doi.org/10.1111/jopy.12915

Surfing the OCEAN: The machine learning psycholexical approach 2.0 to detect personality traits in texts

Giannini, F; Marelli, M; Stella, F; Monzani, D; Pancani, L

2024

Abstract

Objective: We aimed to develop a machine learning model to infer OCEAN traits from text. Background: The psycholexical approach makes it possible to retrieve information about personality traits from human language. However, it has rarely been applied because of methodological and practical issues that current computational advancements could overcome. Method: Classical taxonomies and a large Yelp corpus were leveraged to learn an embedding for each personality trait. These embeddings were used to train a feedforward neural network for predicting trait values. The model's generalization performance was evaluated through two external validation studies involving experts (N = 11) and laypeople (N = 100) in a discrimination task about the best markers of each trait and polarity. Results: Intrinsic validation of the model yielded excellent results, with R² values greater than 0.78. The validation studies showed a high proportion of matches between participants' choices and model predictions, confirming the model's efficacy in identifying new terms related to the OCEAN traits. The best performance was observed for agreeableness and extraversion, especially for their positive polarities. The model was less efficient in identifying the negative polarity of openness and conscientiousness. Conclusions: This innovative methodology can be considered a “psycholexical approach 2.0,” contributing to research in personality and its practical applications in many fields.
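The two kinds of figures reported in the abstract, an R² value for intrinsic validation and a proportion of matches between participants' choices and model predictions, can be illustrated with a minimal Python sketch. The function names and the toy data below are hypothetical and are not taken from the paper; the exact scoring procedure used by the authors is not described in this record, so the match computation is only an assumption about how such a proportion might be obtained.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def match_proportion(participant_choices, model_choices):
    """Share of items on which the participant's choice equals the model's."""
    matches = sum(c == m for c, m in zip(participant_choices, model_choices))
    return matches / len(participant_choices)


# Hypothetical toy data (assumption, for illustration only).
observed_trait_values = [1.0, 2.0, 3.0, 4.0]
predicted_trait_values = [1.0, 2.0, 3.0, 5.0]
print(r_squared(observed_trait_values, predicted_trait_values))

participant = ["a", "b", "a", "b"]
model = ["a", "b", "b", "b"]
print(match_proportion(participant, model))
```

An R² close to 1 means the predicted trait values account for most of the variance in the reference values, while the match proportion simply counts agreement item by item.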

Subtype	Journal article - Scientific article
Keywords	Big Five; natural language processing; psycholexical approach; word embedding
Content language	English
Ahead-of-print or first online publication date	13 Jan 2024
Publication date	2024
Journal	JOURNAL OF PERSONALITY
Volume	92
Issue	6 (December 2024)
First page	1602
Last page	1615
Article DOI	https://dx.doi.org/10.1111/jopy.12915
Fulltext	reserved
Appears in types	01 - Journal article

Files in this item:

File	Access	Attachment type	License	Size	Format
Giannini-2024-Journal of Personality-VoR.pdf	Archive managers only	Publisher’s Version (Version of Record, VoR)	All rights reserved	242.68 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/465118
