LIS2SPEECH LIS translation in written text and spoken language

Giuseppe Mercurio

Supervisor: Maurizio Morisio. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Deaf and hard-of-hearing people can communicate with each other using sign languages, but they may have difficulty connecting with the rest of society. Sign Language Recognition has been studied since 1983, but only in the last decade has it gained wider attention. Most published works concern American, Chinese and German sign languages, whereas studies on Italian Sign Language (LIS) are still scarce. This work therefore aims to offer a novel mechanism to translate isolated LIS signs into written Italian text and speech. To this end, Neural Networks, Deep Learning and Computer Vision have been used to create an application, called LIS2Speech (LIS2S), that returns the Italian translation of a LIS sign performed in a recorded video. The method relies on skeletal features of the hands, body and face extracted from RGB videos, without the need for any additional equipment such as coloured gloves. Since the goal is to reach as many people as possible, LIS2S has been developed as a Progressive Web App, which can run on any camera-equipped device, whether a computer or a smartphone. The results obtained with this approach are in line with those of automatic tools developed for other sign languages: the model correctly recognises and discriminates between signs belonging to a vocabulary of 50 words, a size comparable to that of other corpora for isolated sign language recognition. In addition, a new dataset for Continuous Sign Language Recognition (CSLR) has been created and is being constantly expanded, with the aim of providing a publicly available benchmark for this kind of task. Although the experiments conducted yielded promising results, this work has only scratched the surface of the problem.
In particular, the need for a corpus suited to CSLR tasks has emerged, since the proposed solution can translate only a single sign at a time. Future work may examine the possibility of segmenting sentences into isolated signs that the current model can then translate. Moreover, to produce an application that is useful in real-life settings, the present prototypes must be converted into real-time tools. Finally, a further improvement is to extend the number of signs the proposed design can translate, broadening the application fields of LIS2S.

Supervisors: Maurizio Morisio
Academic year: 2020/21
Publication type: Electronic
Number of pages: 83
Subjects:
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA
Collaborating companies: Orbyta Tech srl.
URI: http://webthesis.biblio.polito.it/id/eprint/18096