
Representation of valence-arousal-dominance in the facial expression recognition application scenario

Michela Putzu

Representation of valence-arousal-dominance in the facial expression recognition application scenario.

Supervisors: Federica Marcolin, Francesco Ferrise. Politecnico di Torino, Master's degree programme in Biomedical Engineering, 2021


In the study of emotions, there are different kinds of classification model: categorical models, which describe emotions in terms of basic emotions and are suitable for unimodal spaces, and dimensional models, which classify emotions along two or more dimensions. Several models have been developed over the years, but only some of them are accepted by the scientific community. To provide a better representation of emotions, a new dimensional model was created from Russell's circumplex model and the PAD emotional state model developed by Albert Mehrabian and James A. Russell. The work starts from an experimental database of 112 people aged between 19 and 35. Each participant's face was recorded on video (RGB and depth) while they viewed 48 images selected from IAPS and GAPED, two databases designed to arouse emotions; the session was preceded by a training phase in which 6 images were shown. The images were selected to arouse the basic emotions identified by Ekman (anger, fear, disgust, sadness, happiness and surprise), to which neutrality was added. After viewing each image, the participant rated it in terms of valence and arousal on the SAM (Self-Assessment Manikin) scale and indicated the emotion felt. This questionnaire made it possible to verify whether the images chosen for the database aroused the intended emotion and how they were distributed in the affective space. Furthermore, to evaluate the effect of dominance, a dimension not available for this database, data on virtual environments (developed in a parallel study) were used. These responses were used to analyse and build the new model, named the "skewer model".
At the same time, the MI-TO database was created through the following steps: (1) viewing the recorded video and saving the most representative frame with its label; (2) aligning the RGB and depth frames; (3) cropping the frames. Since the recognition of a facial expression is not unique, the choice of the emotion, and therefore of the most significant frame, was made by a focus group that also included a psychologist, in order to make the choice as accurate as possible. The last phase of the work focused on training and testing a convolutional neural network (CNN), which reached a correct classification rate above 75%, with an imbalance towards the neutral class.
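The cropping step applied to an already aligned RGB and depth pair can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `crop_pair`, the frame sizes, the data types and the bounding box are hypothetical, and NumPy is assumed as the array library. The only point it shows is that, once the two frames are aligned pixel by pixel, one bounding box can crop both.

```python
import numpy as np


def crop_pair(rgb, depth, box):
    """Crop an aligned RGB/depth pair with one shared bounding box.

    `box` is (top, left, height, width) in pixels. This helper is a
    hypothetical illustration, not the procedure used in the thesis.
    """
    if rgb.shape[:2] != depth.shape[:2]:
        raise ValueError("RGB and depth frames must share height and width")
    top, left, height, width = box
    if top < 0 or left < 0 or top + height > rgb.shape[0] or left + width > rgb.shape[1]:
        raise ValueError("bounding box falls outside the frame")
    rows = slice(top, top + height)
    cols = slice(left, left + width)
    return rgb[rows, cols], depth[rows, cols]


# Synthetic stand-ins for one selected frame (sizes are assumptions).
rgb_frame = np.zeros((480, 640, 3), dtype=np.uint8)
depth_frame = np.zeros((480, 640), dtype=np.uint16)

# Hypothetical face bounding box: (top, left, height, width).
face_box = (100, 200, 224, 224)

rgb_crop, depth_crop = crop_pair(rgb_frame, depth_frame, face_box)
```

Using a single box for both modalities keeps each cropped depth pixel paired with its RGB pixel, which only holds if the alignment step has already been carried out.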

Supervisors: Federica Marcolin, Francesco Ferrise
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 85
Additional Information: Restricted thesis. Full text not available
Degree programme: Master's degree programme in Biomedical Engineering
Degree class: New organization > Master science > LM-21 - BIOMEDICAL ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/20155