
Human-Robot Interaction for autonomous mobile robots using Large Language Models

Giovanni Bordero

Human-Robot Interaction for autonomous mobile robots using Large Language Models.

Supervisors: Alessandro Rizzo, Pangcheng David Cen Cheng. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2024

Abstract:

This thesis investigates the application of Large Language Models (LLMs) to robotic control and navigation through natural language instructions, both in simulation and in the real world. The research is structured in three progressive phases. The first phase used a 2D simulated environment in which the LLM calculated parameters such as velocity, time, and angular velocity to move a mobile robot. A second approach, based on functions that calculate these parameters themselves, was also tried in order to remove the computational overhead placed on the model. The second phase introduced a 3D simulated environment using the Gazebo simulator within the Robot Operating System (ROS) framework. This phase employed Simultaneous Localization and Mapping (SLAM) for map construction, with navigation managed by the ROS Navigation Stack using Behavior Trees. The LLM’s role evolved to designing comprehensive plans for task execution in these complex environments. The final phase involved a physical Turtlebot3 robot in a controlled laboratory setting, validating the systems and methodologies tested in simulation and addressing real-world challenges. Throughout the study, three LLMs were evaluated on task completion time, execution correctness, and reproducibility of results, with the aim of creating a reliable system for safety-critical environments. Another key element of this research was addressing the challenges inherent in prompting LLMs for robotic control, emphasizing the importance of providing complete and accurate descriptions of both the problem and the environment, including all relevant constraints. To go beyond the concept of a single prompt, several architectures were developed that interact with the LLMs multiple times in order to maximise performance.
The results revealed that LLMs showed limited effectiveness in creating plans requiring low-level specifications but excelled in generating detailed, high-level movement plans. They performed well when intricate movement execution details were delegated to the navigation stack and demonstrated promising results in completing real-world inspired tasks. The study highlights the potential of LLMs for enhancing autonomous robot movement while acknowledging their current limitations. It contributes to the broader understanding of LLM-robot interaction design and provides insights into the practical application of these models in robotic systems.
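As an illustration of the function-based approach mentioned for the first phase, the following is a minimal Python sketch, not the thesis implementation. It assumes the LLM emits only high-level calls such as "move 1 m" or "rotate 90 degrees", and that helper functions derive the velocity, angular velocity, and time. All names (`VelocityCommand`, `move_straight`, `rotate`) and the default speeds are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VelocityCommand:
    """One motion segment: constant velocities held for a fixed duration."""
    linear: float    # forward velocity in m/s
    angular: float   # angular velocity in rad/s
    duration: float  # time to hold the velocities, in seconds


def move_straight(distance_m: float, speed_mps: float = 0.2) -> VelocityCommand:
    """Derive velocity and time for a straight move (default speed is an assumption)."""
    if speed_mps <= 0:
        raise ValueError("speed_mps must be positive")
    direction = 1.0 if distance_m >= 0 else -1.0
    return VelocityCommand(direction * speed_mps, 0.0, abs(distance_m) / speed_mps)


def rotate(angle_deg: float, rate_radps: float = 0.5) -> VelocityCommand:
    """Derive angular velocity and time for an in-place rotation (positive = counter-clockwise)."""
    if rate_radps <= 0:
        raise ValueError("rate_radps must be positive")
    angle_rad = math.radians(angle_deg)
    direction = 1.0 if angle_rad >= 0 else -1.0
    return VelocityCommand(0.0, direction * rate_radps, abs(angle_rad) / rate_radps)


# A plan the model might produce by naming functions instead of computing numbers.
plan = [move_straight(1.0), rotate(90), move_straight(0.5)]
for step in plan:
    print(step)
```

In this sketch the model only chooses which function to call and with what distance or angle, so the arithmetic is deterministic and does not depend on the model's numerical accuracy.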

Supervisors: Alessandro Rizzo, Pangcheng David Cen Cheng
Academic year: 2024/25
Publication type: Electronic
Number of pages: 87
Additional information: Confidential thesis. Full text not available
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - Computer Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/33968