Corsico, A., Rigamonti, G., Zini, S., Celona, L., & Napoletano, P. (2026). Decoding Affective States from fMRI Using Automatically Labeled Multi-modal Movie Stimuli. In Image Analysis and Processing – ICIAP 2025 Workshops: 23rd International Conference, Rome, Italy, September 15–19, 2025, Proceedings, Part I (pp. 89–100). Springer. https://doi.org/10.1007/978-3-032-11317-7_8
Decoding Affective States from fMRI Using Automatically Labeled Multi-modal Movie Stimuli
Rigamonti, Giorgia; Zini, Simone; Celona, Luigi; Napoletano, Paolo
2026
Abstract
Emotions (feeling states arising from physiological changes) and sentiments (subjective evaluations of experience as positive, negative, or neutral) are central to human behavior, decision-making, and mental health. These affective states are shaped by naturalistic stimuli, such as movies, often outside conscious awareness. Understanding how external content evokes emotions, and how those emotions are reflected in brain dynamics, is key to advancing affective neuroscience and to identifying neural markers of conditions such as stress, depression, and ADHD. Unlike task-based or resting-state paradigms, naturalistic approaches such as film viewing offer greater ecological validity by engaging distributed networks involved in perception, emotion, and social cognition. In this study, we analyze existing fMRI data from the Algonauts 2025 challenge, derived from the CNeuroMod project, in which participants watched episodes of Friends while undergoing brain imaging. We investigate how dialogue audio and subtitles contribute to valence-based emotional processing. Sentiment labels (positive, neutral, negative) were generated automatically: audio was analyzed with a wav2vec 2.0 speech emotion model estimating arousal, valence, and dominance, while subtitle sentiment was derived using VADER and Flair. These features were aligned with the fMRI acquisition to examine how auditory and linguistic components map onto brain networks involved in emotion and social cognition, including the amygdala, insula, orbitofrontal cortex, and superior temporal sulcus. Our findings show that acoustic and combined sentiment models track cortical dynamics in the salience and default mode networks more closely, while subtitle-based models yield weaker, more localized effects. These results highlight the value of multimodal sentiment features for studying affective brain responses during naturalistic viewing.
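
The abstract names VADER and Flair as the subtitle-sentiment tools but does not give the decision rule that maps their outputs to the three labels. The following is a minimal sketch, not the authors' pipeline: it assumes the conventional ±0.05 threshold on VADER's compound score and a hypothetical Flair-based tie-break for lines VADER calls neutral; the label_subtitle function and its thresholds are illustrative.

import nltk
from flair.data import Sentence
from flair.models import TextClassifier

nltk.download("vader_lexicon", quiet=True)  # VADER lexicon shipped with NLTK
from nltk.sentiment.vader import SentimentIntensityAnalyzer

vader = SentimentIntensityAnalyzer()
flair_clf = TextClassifier.load("en-sentiment")  # pretrained POSITIVE/NEGATIVE model

def label_subtitle(text: str, pos_thr: float = 0.05, neg_thr: float = -0.05) -> str:
    """Map one subtitle line to positive / neutral / negative (illustrative rule)."""
    # VADER's compound score, thresholded at the conventional +/-0.05.
    compound = vader.polarity_scores(text)["compound"]
    if compound >= pos_thr:
        return "positive"
    if compound <= neg_thr:
        return "negative"
    # VADER says neutral: let Flair override only when it is confident.
    sent = Sentence(text)
    flair_clf.predict(sent)
    label = sent.labels[0]
    if label.score > 0.9:
        return "positive" if label.value == "POSITIVE" else "negative"
    return "neutral"

print(label_subtitle("We were on a break!"))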

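The abstract also states that sentiment features were aligned with the fMRI acquisition, without detailing the alignment. One plausible form, sketched below under stated assumptions, is a volume-by-volume valence regressor built from utterance timings: the events format, labels_to_regressor helper, and valence coding are hypothetical, and the 1.49 s repetition time matches CNeuroMod's Friends acquisitions but should be read from the dataset's metadata rather than hard-coded.

import numpy as np

TR = 1.49  # seconds per fMRI volume (CNeuroMod value; confirm against the data)
LABEL_TO_VALENCE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}

def labels_to_regressor(events, n_volumes, tr=TR):
    """Build a per-volume valence regressor from (onset_s, duration_s, label) events."""
    reg = np.zeros(n_volumes)
    for onset, duration, label in events:
        start = int(onset // tr)                          # first volume the event touches
        stop = min(n_volumes, int(np.ceil((onset + duration) / tr)))
        reg[start:stop] = LABEL_TO_VALENCE[label]
    return reg

events = [(2.0, 1.5, "positive"), (10.3, 2.0, "negative")]
print(labels_to_regressor(events, n_volumes=12))

In practice such a regressor would typically be convolved with a hemodynamic response function before being related to the BOLD signal; that step is omitted here for brevity.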

