Towards Context based Monocular Depth Estimation

Alessio Cappellato

Supervisors: Barbara Caputo, Nicola Gatti, Sabine Süsstrunk. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Monocular depth estimation is a classical computer vision task: given a single RGB image, densely predict, for every pixel, the distance between the camera and the object depicted at that pixel. This information is useful in many practical settings, such as 3D reconstruction, visual simultaneous localization and mapping (SLAM), and autonomous driving, because it supports reasoning about the geometric structure of the environment and the spatial relationships between the objects in it. In recent years, a number of fully convolutional encoder-decoder networks have been applied to this problem; their popularity is rooted in their locality and translation invariance properties, which allow parameter-efficient modelling of highly spatially correlated information.

In this context, in the first part of this work, we improve the design of a convolutional decoder that incorporates the Laplacian pyramid decomposition of the input image to guide the progressive prediction of depth residuals. This additional feature gives the decoder important information on the location of object boundaries, but it also carries uninformative noise due to intra-object variations.
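As an illustration of the decomposition mentioned above, the following is a minimal NumPy sketch of a Laplacian pyramid and its coarse-to-fine reconstruction. It is not the thesis implementation: the function names are hypothetical, a 2x2 box average stands in for the usual Gaussian blur, nearest-neighbour repetition stands in for smooth upsampling, and the image height and width are assumed to be divisible by 2 raised to the number of levels.

```python
import numpy as np


def downsample(img):
    """Halve the resolution by averaging each 2x2 block.

    Assumption: a box filter replaces the Gaussian blur used in the
    classical Laplacian pyramid. Expects shape (H, W, C) with even H, W.
    """
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def upsample(img):
    """Double the resolution by repeating each pixel (nearest neighbour)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(img, levels):
    """Return [band_0, ..., band_{levels-1}, low_res].

    Each band holds the detail lost when moving to the next coarser
    scale; edges such as object boundaries show up as large values.
    """
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)  # coarsest approximation of the image
    return pyramid


def reconstruct(pyramid):
    """Rebuild the full-resolution signal from coarse to fine.

    A residual-predicting decoder follows the same pattern: start from
    a coarse map and add one finer residual per scale.
    """
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = upsample(current) + band
    return current
```

Calling `laplacian_pyramid(image, 3)` on a 32x32 RGB array returns three detail bands at 32x32, 16x16, and 8x8 plus a 4x4 low-resolution image, and `reconstruct` recovers the original array up to floating-point error. In a depth decoder of the kind described, each band could be supplied as an extra input at the matching scale, though how the thesis does this is not stated in the abstract.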

Supervisors

Barbara Caputo, Nicola Gatti, Sabine Süsstrunk

Publication type

Electronic

URI

https://webthesis.biblio.polito.it/id/eprint/20541