Environment and Embodiment adaptation of Vision-Language-Action models for robotic manipulation

Andrea Delli

Supervisors: Giuseppe Bruno Averta, Davide Buoso, Francesca Pistilli. Politecnico di Torino, Master's degree programme in Computer Engineering, 2025

PDF (Tesi_di_laurea) - Thesis
Access restricted to: staff users only until 12 June 2027 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Vision-Language-Action (VLA) models represent a recent and promising direction in robotics, enabling agents to understand natural language instructions, perceive complex visual scenes, and perform manipulation tasks. However, these models often struggle to generalize across different robotic embodiments and environments, as changes in camera viewpoints, kinematics, or action spaces introduce significant distribution shifts. This thesis investigates the problem of robotic embodiment adaptation by evaluating the performance and adaptability of existing pre-trained VLA models on diverse robotic setups. The study focuses on fine-tuning, through imitation learning, and assessing multiple state-of-the-art VLA architectures: Diffusion Policy, OpenVLA, OpenVLA-OFT, SmolVLA, GR00T, and π0. Data were collected primarily in simulation with the RLBench environment, which provides standardized tasks for the 7-DoF Franka Panda arm, and further validated on a 6-DoF real-world manipulator developed by the DIANA student team.

In total, approximately 500 simulated episodes and 50 real demonstrations were gathered.

Supervisors

Giuseppe Bruno Averta, Davide Buoso, Francesca Pistilli

Academic Year

2025/26

Publication type

Electronic

Number of pages

Degree programme

Master's degree programme in Computer Engineering

Degree class

New system > Master's degree > LM-32 - COMPUTER ENGINEERING

URI

https://webthesis.biblio.polito.it/id/eprint/38614
