Alessandro De Marco
Real-World Fine-Tuning of Diffusion Policies for Autonomous Exploration Using Reinforcement Learning and Human Demonstrations.
Supervisors: Raffaello Camoriano, Luca Benini, Michele Magno. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2025
Abstract
Autonomous exploration is a fundamental challenge in robotics, with broad implications for operations in remote or hazardous environments. Diffusion policies, generative models that can predict robot actions, have emerged as powerful tools for navigation. However, these models are typically trained with imitation learning (IL) and often fail to generalize beyond their demonstrations. Furthermore, fine-tuning diffusion policies with reinforcement learning (RL) is challenging, because backpropagating through the denoising chain is non-trivial and sample collection in the real world is costly. This thesis addresses these challenges by adapting Q-weighted Variational Policy Optimization (QVPO) to fine-tune Navigation with Goal Masked Diffusion (NoMaD), a state-of-the-art diffusion-based navigation model that unifies goal-conditioned navigation and task-agnostic exploration through goal masking and predicts multimodal action sequences directly from past RGB frames.
The fine-tuning is guided by an external critic that evaluates the sampled trajectories and reweights the diffusion loss according to their Q-values, enabling RL-based fine-tuning without traversing the denoising process.
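The reweighting idea can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `q_weighted_diffusion_loss`, the clipping of Q-values at zero, and the normalisation of the weights are all assumptions made for illustration, and the exact weighting used by QVPO may differ. The point it shows is that the critic's Q-values enter only as constant per-sample weights on the usual noise-prediction error, so no gradient has to flow through the denoising chain.

```python
def q_weighted_diffusion_loss(eps_pred, eps_true, q_values):
    """Noise-prediction MSE per sample, reweighted by critic-derived weights.

    eps_pred, eps_true: lists of equal-length lists (one row per sampled
    action sequence). q_values: one critic score per sample.
    All names here are hypothetical, chosen for this sketch only.
    """
    # Standard denoising objective: mean squared error per sample.
    per_sample = [
        sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)
        for pred, true in zip(eps_pred, eps_true)
    ]
    # Assumption: negative Q-values are clipped to zero so that poor
    # samples contribute nothing, then weights are normalised to sum to 1.
    weights = [max(q, 0.0) for q in q_values]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # no sample is rated positively; nothing to reinforce
    # Weights act as constants: only the denoising error carries gradient.
    return sum(w / total * l for w, l in zip(weights, per_sample))


# Two sampled action sequences; the critic prefers the second one.
eps_true = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
eps_pred = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
loss = q_weighted_diffusion_loss(eps_pred, eps_true, q_values=[1.0, 3.0])
```

In an actual training loop the same computation would be carried out on tensors in an automatic-differentiation framework, with the weights detached from the computation graph.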
