Fully end-to-end deep learning policies for vision-based autonomous racing on ultra-low-power nano-drones

Florentin-Cristian Udrea

Supervisors: Daniele Jahier Pagliari, Alessio Burrello. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2024

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:	In recent years, drones have found applications in a variety of domains, from aerial surveillance and rescue missions to precision agriculture and filmmaking. Advances in drone miniaturization have led to nano-drones: very compact drones only 10 cm in diameter and a few tens of grams in weight. They offer advantages, such as high maneuverability in confined areas and safe operation around people, but also have limitations, such as a battery lifetime of only a few minutes and a microcontroller unit (MCU) limited to under 100 mW of power, which restricts computational capacity. Over the past decade, autonomous drone racing (ADR) has become increasingly popular. In ADR competitions, drones must autonomously navigate through gates and avoid obstacles at high speed, which requires reactive perception and precise control. Because of these requirements, ADR competitions have become a proxy for improving autonomous drones' navigation capabilities, encouraging advances in onboard perception and control algorithms. More recently, the research community has placed strong emphasis on fully end-to-end autonomous systems for ADR, in which a single deep learning system processes the sensor inputs and outputs the motor commands. While end-to-end policies have become state-of-the-art for larger drone systems, they have not been employed on nano-drones because of their computational constraints. This thesis focuses on the Crazyflie 2.1, a nano-drone which, equipped with an ultra-low-power monochrome camera, state-estimation sensors, and a GAP8 MCU, can execute deep learning tasks onboard. The objective of the work is to develop, in simulation, a fully end-to-end vision-based deep learning policy that is able to compete in ADR competitions and targets deployment on nano-drones. The proposed method consists of multiple steps. 
First, we use a learning-by-cheating framework, in which a teacher policy with access to privileged information is used to teach a vision-based student policy. While the teacher policy can be trained directly with reinforcement learning (RL) thanks to its simpler state-based input space, the vision-based student policy has to be trained via imitation learning (IL), using the teacher policy as a dataset collector. Second, we leverage the dataset created in the previous step to apply neural architecture search (NAS) techniques in order to reduce the policy's computational cost and ensure its deployability. Finally, RL fine-tuning through the asymmetric actor-critic framework is performed, using the pretrained post-NAS actor network and the teacher's pretrained critic network. This last step is essential to achieve a high-performing and stable vision-based student policy. To reduce the visual sim-to-real gap and to ensure deployability in unseen environments, multiple domain generalization techniques, such as visual domain randomization and pencil filtering, were used during the training stages. As a result, our method yields a vision-based policy that is able to compete on an unknown ADR track, completing it successfully up to 70% of the time. Thanks to NAS, our final policy achieves an R2 score of 0.94 on the collected dataset while requiring only 15 MMAC, a 12x reduction with minimal performance loss with respect to our pre-NAS seed model (R2 score of 0.95 at 183 MMAC). The policy is thus deployable at 30 Hz entirely onboard the GAP8 SoC, allowing it to process image frames in real time at the native camera frame rate.
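
The teacher-to-student imitation step and the R2 and MAC figures quoted in the abstract can be illustrated with a toy sketch. This is not the thesis implementation: the teacher here is a hand-written linear function standing in for an RL-trained policy, the student is a linear model rather than a vision network, and the "camera" is a noisy copy of the state. All function names, the two-dimensional state, and the noise level are hypothetical choices made only for illustration.

```python
import random


def teacher_policy(state):
    """Hypothetical stand-in for the RL-trained teacher: privileged state -> action."""
    gate_offset, speed = state
    return 0.8 * gate_offset - 0.1 * speed


def render_observation(state, rng):
    """Hypothetical stand-in for the camera: a noisy view of the privileged state."""
    gate_offset, speed = state
    return [gate_offset + rng.gauss(0.0, 0.05), speed + rng.gauss(0.0, 0.05)]


def collect_dataset(n_samples, rng):
    """Use the teacher as a dataset collector: pairs of (observation, teacher action)."""
    data = []
    for _ in range(n_samples):
        state = (rng.uniform(-1.0, 1.0), rng.uniform(0.0, 2.0))
        data.append((render_observation(state, rng), teacher_policy(state)))
    return data


def student_policy(weights, bias, obs):
    """Linear student; a real student would be a neural network on image frames."""
    return weights[0] * obs[0] + weights[1] * obs[1] + bias


def train_student(data, epochs=50, lr=0.05):
    """Imitation learning as supervised regression onto teacher actions (plain SGD)."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for obs, target in data:
            err = student_policy(weights, bias, obs) - target
            weights[0] -= lr * err * obs[0]
            weights[1] -= lr * err * obs[1]
            bias -= lr * err
    return weights, bias


def r2_score(targets, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


def mac_reduction(seed_mmac, final_mmac):
    """Ratio between the seed model's and the final model's MAC counts."""
    return seed_mmac / final_mmac


if __name__ == "__main__":
    rng = random.Random(0)
    dataset = collect_dataset(500, rng)
    w, b = train_student(dataset)
    targets = [t for _, t in dataset]
    preds = [student_policy(w, b, obs) for obs, _ in dataset]
    print("student R2 vs. teacher:", r2_score(targets, preds))
    # Figures from the abstract: 183 MMAC seed model, 15 MMAC final model.
    print("MAC reduction:", mac_reduction(183, 15))
    # At 30 Hz, a 15 MMAC policy needs 15 * 30 = 450 MMAC per second.
    print("MMAC per second at 30 Hz:", 15 * 30)
```

With the abstract's own numbers, 183 MMAC divided by 15 MMAC gives a ratio of about 12.2, consistent with the stated 12x reduction.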
Supervisors:	Daniele Jahier Pagliari, Alessio Burrello
Academic year:	2024/25
Publication type:	Electronic
Number of pages:	125
Subjects:
Degree programme:	Master's degree in Data Science And Engineering
Degree class:	New regulations > Master's degree > LM-32 - Computer Engineering
Joint supervision institution:	IDSIA (Switzerland)
Collaborating companies:	IDSIA USI-SUPSI
URI:	http://webthesis.biblio.polito.it/id/eprint/33877
