
Scene Graph Generation in Autonomous Driving: a Neuro-symbolic approach

Paolo Emmanuel Ilario Dimasi

Scene Graph Generation in Autonomous Driving: a Neuro-symbolic approach.

Supervisors: Lia Morra, Fabrizio Lamberti. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


The 2022 study on traffic fatalities in Italy by the Italian National Institute of Statistics (ISTAT) reports 454 daily fatalities and 561 injuries, primarily due to distraction. The success of Autonomous Driving therefore depends on intelligent perception systems that enhance road safety, and vision systems have played a critical role throughout its history. In Computer Vision, Deep Learning has gained mainstream acceptance for its ability to model complex problems such as Object Detection and Instance Segmentation. More recently, Scene Graph Generation has emerged as a novel paradigm in which a scene is represented as a graph, with objects as nodes and their relationships as edges. This area has seen substantial research, but only a small fraction of it concerns autonomous driving, and most of that work focuses on specific traffic scenarios, which limits diversity. A comprehensive effort to incorporate all relevant objects in traffic scenarios resulted in the Traffic Genome dataset. However, the dataset is biased by uneven relationship frequencies, which can lead to the misclassification of rare events. This thesis addresses the issue by infusing prior knowledge into scene graph generation using neuro-symbolic approaches. The Relational Transformer network is used as the baseline because of its state-of-the-art results among one-stage approaches. Two methodologies for knowledge injection are adopted. The first injects external knowledge through Knowledge Graph Embedding (KGE) techniques, using PandaSet, an autonomous driving dataset released by Hesai and Scale AI, as the foundational knowledge base because of its rich multi-modal information. The second uses a Logic Tensor Network (LTN) for constraint satisfaction, employing axioms as constraints during training.
Results indicate that both methods improve performance, with the choice between them depending on the trade-off between deployment speed and accuracy: KGE methods are faster to develop but are limited by the relationships available in the knowledge base, while LTN can potentially outperform them but requires more time to design optimal axioms based on domain expertise.

Supervisors: Lia Morra, Fabrizio Lamberti
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 92
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/29354