
Gabriele Spagnuolo
One-Stage Depth Enhancement: Combining Depth Super-Resolution and Depth Completion.
Supervisors: Alessandro Rizzo, Enrico Civitelli. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2025
PDF (Tesi_di_laurea)
- Thesis
Access restricted to: staff users only until 11 October 2026 (embargo date). License: Creative Commons Attribution Non-commercial No Derivatives. Download (18MB) |
Abstract: |
Depth information plays an important role in many modern computer vision applications, including industrial robotics, autonomous driving, and 3D scene understanding and reconstruction. However, most low-end depth sensors in common use today cannot effectively meet the requirements of the algorithms in these downstream applications, especially in terms of spatial resolution. Moreover, the depth images provided by such sensors often show a significant amount of noise and information loss, making depth enhancement an active research area. Noise in depth images becomes particularly problematic in applications that demand high precision, such as industrial robotics, where accuracy on the millimeter scale is often crucial. Additionally, low-resolution images suffer from insufficient pixel density per square millimeter, leaving the system with limited information for completing its task effectively. Given this room for improvement, this work focuses primarily on two key tasks: depth super-resolution, which enhances the spatial resolution of a depth map, and depth completion, which recovers missing information that the depth sensor did not capture. After a review of state-of-the-art academic publications on the topic, the SGNet model architecture was selected for further investigation and improvement. By testing various training settings and design strategies, we aimed to address not only the super-resolution task, for which the network was originally designed, but also the reconstruction of missing areas in the depth map. Through careful application of loss functions during the training phase and by integrating functionality from DepthAnythingV2, one of the latest transformer-based architectures for monocular depth estimation, we achieved satisfactory results.
The experiments were conducted on the NYUv2 benchmark dataset, widely used in depth-related computer vision applications; on the MetaGraspNet dataset, suited to industrial robotics applications; and finally in a real-world application setting, with images taken from actual industrial manipulators at Comau S.p.A. The results were accurate both qualitatively and quantitatively, achieving performance levels comparable to the state of the art. |
---|---|
Supervisors: | Alessandro Rizzo, Enrico Civitelli |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 107 |
Subjects: | |
Degree programme: | Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica) |
Degree class: | New regulations > Master's degree > LM-25 - Automation Engineering |
Collaborating companies: | COMAU SPA |
URI: | http://webthesis.biblio.polito.it/id/eprint/35399 |