An emotional coherence analysis with a multimodal Neural Network. A study based on a novel affective database of facial expression.

Fabio Tatti

Supervisors: Federica Marcolin, Luca Ulrich, Alberto Raposo, Daniel Mograbi. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2023


Facial Emotion Recognition (FER) is a new technology that aims to replicate the decoding of emotion from facial expressions using state-of-the-art AI techniques. It requires a dataset of facial expressions annotated in advance, so that a ground truth is available. The dataset used here was built through a collaboration between the Politecnico di Torino and the Pontifícia Universidade Católica do Rio de Janeiro, with the aim of creating a dataset of Brazilian faces. The data were collected by showing volunteers 48 emotional images, in random order, selected from the IAPS and GAPED databases, while an RGB + Depth sensor filmed their faces to capture the emotion. For each image shown, the volunteer was asked to self-evaluate the emotion experienced, describing it with three parameters: valence, arousal, and the emotion itself, chosen among seven options. Data collection ran from March 2022 to July 2022. The study builds on Paul Ekman's research on basic emotions, which identified six basic emotions (anger, disgust, fear, happiness, sadness, surprise) capable of describing the human emotional spectrum; in our case the neutral emotion was added as well. Valence and arousal were assessed with the Self-Assessment Manikin (SAM). For each video, the frame judged most emotionally representative was selected. The 3D and 2D data were then merged and cropped to isolate the face. The next step was to train the neural network: a 2D + 3D multimodal approach was adopted to build a robust classifier. It is based on a 2D Vision Transformer (ViT) encoder and a 3D Convolutional Neural Network (CNN), so that it can process both point clouds and images.
In this way we can exploit the attention mechanism of the transformer and concatenate its output with the features extracted by the 3D convolution. Given the dataset, we investigate whether, during the test, the more images were shown, the less the volunteer was stimulated by them. This comes from the intuition that the order of appearance of the images may affect the efficacy of the emotional stimuli. To address this, the order of appearance of each image was added as a feature fed to the neural network during the training and test phases. We then investigate the discrepancy between the volunteer's self-evaluation and the outcome of the frame-selection phase with regard to the perceived emotion. This can also be formulated as the difference between inner state and external state. We perform this analysis by partitioning the population first by gender (female and male) and then by nationality (Italian and Brazilian). From the results we want to determine whether the two sides of a given partition can be clearly separated based on the outcome of the inner-state vs. external-state comparison. This can also help us investigate the collectivistic rather than individualistic nature of the two cultures. The literature does not assign either culture to a single category, since both present aspects of both sides. Nevertheless, we use the comparison to answer the question: can we affirm that one of the two cultures, Italian or Brazilian, is more collectivistic than the other?
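
The inner-state vs. external-state comparison described above could be sketched as follows. This is only an illustrative sketch, not the thesis's actual procedure: the record layout, the field names (`self_report`, `frame_label`), the function `discrepancy_rate_by`, and the toy records are all assumptions made up for the example, and the numbers carry no relation to the thesis results. It computes, for each group of a chosen partition, the fraction of samples in which the self-reported emotion differs from the emotion assigned to the selected frame.

```python
from collections import defaultdict

# Hypothetical toy records, NOT data from the thesis.
# "self_report": emotion chosen by the volunteer (inner state).
# "frame_label": emotion assigned to the selected frame (external state).
RECORDS = [
    {"gender": "female", "nationality": "Italian",
     "self_report": "happiness", "frame_label": "happiness"},
    {"gender": "female", "nationality": "Brazilian",
     "self_report": "sadness", "frame_label": "neutral"},
    {"gender": "male", "nationality": "Italian",
     "self_report": "fear", "frame_label": "surprise"},
    {"gender": "male", "nationality": "Brazilian",
     "self_report": "disgust", "frame_label": "disgust"},
    {"gender": "female", "nationality": "Italian",
     "self_report": "anger", "frame_label": "neutral"},
    {"gender": "male", "nationality": "Brazilian",
     "self_report": "neutral", "frame_label": "neutral"},
]


def discrepancy_rate_by(records, key):
    """Return {group: fraction of records whose inner and external state differ}.

    `key` names the field used to partition the population,
    e.g. "gender" or "nationality".
    """
    totals = defaultdict(int)
    mismatches = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["self_report"] != record["frame_label"]:
            mismatches[group] += 1
    return {group: mismatches[group] / totals[group] for group in totals}


by_gender = discrepancy_rate_by(RECORDS, "gender")
by_nationality = discrepancy_rate_by(RECORDS, "nationality")
```

A clear gap between the rates of the two groups in a partition would suggest that the partition separates the population with respect to emotional coherence; whether such a gap is meaningful would still need a proper statistical test, which this sketch does not attempt.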

Supervisors: Federica Marcolin, Luca Ulrich, Alberto Raposo, Daniel Mograbi
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 123
Additional Information: Thesis under embargo. Full text not available
Degree programme: Master's degree programme in Data Science and Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Joint supervision institution: Pontifícia Universidade Católica do Rio de Janeiro (BRAZIL)
Collaborating organizations: Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio)
URI: http://webthesis.biblio.polito.it/id/eprint/26856