Giulia Bertea
Indoor Navigation with Vocal Assistant: Alexa vs low-power vocal assistant at the edge.
Supervisors: Marcello Chiaberge, Francesco Salvetti, Vittorio Mazzia. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2021
Thesis (PDF)
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
Among the many methods of human-machine interaction, vocal communication has become very popular in recent years. When first brought to market, vocal assistants were integrated strictly into portable devices; it is now becoming clear that they can also be a useful feature for service robotics. In particular, driving a robot by voice is a more inclusive means of communication that offers a faster and more straightforward way of issuing a command. This technology is beneficial because it addresses the needs of social groups such as the elderly and visually impaired or physically limited people. The main purpose of this thesis is to analyze and compare two different approaches to vocal navigation, developing and deploying both on a robotic platform for domestic environments. The first approach relies on Amazon Alexa and the AWS cloud system, to which it must connect. This dependency is the greatest drawback of the approach: it raises privacy and security issues and requires constant internet availability. A valuable alternative is a low-power vocal assistant at the edge, integrated locally on the robotic platform, which has been implemented by a team of researchers at PIC4SeR (PoliTo Interdepartmental Centre for Service Robotics). This vocal assistant combines several machine learning models for speech recognition and processing. A navigation algorithm for the robotic platform is developed and integrated with both vocal assistants. The main functions implemented allow the robot to follow basic navigation instructions and to steer towards predefined sets of coordinates, which identify rooms and goals in a hypothetical map. Furthermore, an analysis of the meaning extraction methods used by both approaches is presented. 
For the low-power vocal assistant at the edge, a more powerful and precise action classification module, based on natural language processing algorithms, is proposed and integrated into the application. The module uses advanced machine learning techniques, such as transformers and Deep Attention Neural Networks, to encode sentences and classify them into predefined categories of instructions. Finally, after extensive simulation of a service robot guided by the two vocal assistants in a virtual environment, real-world tests are run to highlight the limitations and advantages of both approaches. The results open the way to various future implementations and show how service robotics can benefit greatly from vocal assistants at the edge, especially in indoor environments, for assisting the elderly and visually impaired or physically limited people. |
---|---|
Supervisors: | Marcello Chiaberge, Francesco Salvetti, Vittorio Mazzia |
Academic year: | 2021/22 |
Publication type: | Electronic |
Number of pages: | 102 |
Subjects: | |
Degree programme: | Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica) |
Degree class: | New regulations > Master's degree > LM-25 - Automation Engineering |
Collaborating companies: | Politecnico di Torino - PIC4SER |
URI: | http://webthesis.biblio.polito.it/id/eprint/20555 |