People counting using detection networks and self calibrating cameras on edge computing

Simone Luetto

People counting using detection networks and self calibrating cameras on edge computing.

Supervisor: Francesco Vaccarino. Politecnico di Torino, Master's degree programme in Mathematical Engineering (Corso di laurea magistrale in Ingegneria Matematica), 2019

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Deep learning research is increasingly focused on developing specific applications that are both effective and efficient. Alongside this software progress, hardware research is starting to aim at building embedded devices able to run deep neural networks without relying on large servers. In this perspective, this thesis, developed at Addfor, aims to build software for people counting on an embedded device. The motivation for this choice is to obtain autonomous and versatile software while avoiding privacy issues. Regarding the hardware, two main edge devices have been considered, the Google Coral Dev Board and the NVIDIA Jetson Nano. They are compared in terms of inference speed and flexibility using many different neural networks. The results suggested choosing the Jetson Nano, which is more powerful when performing inference with complex networks. To be deployed on this device, neural networks also have to be optimized, so a specific optimization procedure has been carried out using the TensorRT library. The core of the project is the definition of a model that is able to detect and count a large number of people from a single high-resolution image. The model is based on a convolutional neural network built to perform object detection, meaning that the network is able to detect and localize multiple instances of specific objects in an image. It has then been trained using a case-specific private dataset, provided by Addfor, consisting of indoor images of classrooms with a high average number of people. One of the main issues to be faced is the number of people, which is far higher than the number of instances that a neural network is usually trained to detect. For this reason, an algorithm has been developed that partitions the original images and reconstructs the final result by merging the detections from the partitioned images and eliminating the duplicates.
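The partition-and-merge step described above can be illustrated with a minimal sketch. The abstract does not say how the image is partitioned or how duplicates are eliminated, so everything here is an assumption: overlapping square tiles, a greedy IoU-based non-maximum suppression as the duplicate filter, and a hypothetical `detect_tile` callable standing in for the detection network.

```python
from typing import Callable, List, Tuple

# (x1, y1, x2, y2, score) in pixel coordinates; this layout is an assumption.
Box = Tuple[float, float, float, float, float]
Window = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


def _starts(length: int, tile: int, step: int) -> List[int]:
    """Start offsets of windows of size `tile` that cover [0, length)."""
    starts = list(range(0, max(length - tile, 0) + 1, step))
    if starts[-1] + tile < length:  # make the last window reach the border
        starts.append(length - tile)
    return starts


def make_tiles(width: int, height: int, tile: int, overlap: int) -> List[Window]:
    """Split the image into overlapping windows, listed row by row."""
    step = tile - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than tile")
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, step)
        for x in _starts(width, tile, step)
    ]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (the score field is ignored)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def merge_detections(boxes: List[Box], iou_thr: float = 0.5) -> List[Box]:
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap a kept one."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


def detect_full_image(
    width: int,
    height: int,
    detect_tile: Callable[[Window], List[Box]],  # hypothetical detector wrapper
    tile: int,
    overlap: int,
    iou_thr: float = 0.5,
) -> List[Box]:
    """Run the detector on every tile and merge results in image coordinates."""
    all_boxes: List[Box] = []
    for (x0, y0, x1, y1) in make_tiles(width, height, tile, overlap):
        for (bx1, by1, bx2, by2, score) in detect_tile((x0, y0, x1, y1)):
            # Detections are assumed to be in tile coordinates; shift them back.
            all_boxes.append((bx1 + x0, by1 + y0, bx2 + x0, by2 + y0, score))
    return merge_detections(all_boxes, iou_thr)
```

The people count is then simply the length of the merged list. The overlap should be at least as large as a typical person in the image, so that anyone cut by one tile border appears whole in a neighbouring tile.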
To improve the effectiveness and efficiency of the project, three computer vision problems are addressed. The first is the computation of the vanishing point, which provides useful information on the room geometry and enables the use of constraints on the dimensions of the detected people. The second is the computation of the optical flow, a technique used to find the zones where there is movement, so that the useless parts of the image can be eliminated. The third is the matching of multiple cameras, which is useful for rooms that require more than one camera: it aims at eliminating the parts of the image that are also seen by the other camera, to improve efficiency and avoid duplicates. The resulting software detects people with an accuracy of over 80% and, considering only the number of people, reaches an accuracy of more than 95%. It performs an inference in less than 15 seconds, and this value is lowered to as little as 10 seconds using the information provided by the optical flow and the matching.
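As an illustration of the vanishing point idea mentioned above, the following is a minimal sketch and not the method used in the thesis, which the abstract does not describe. It assumes two image segments that are parallel in the room (for example two desk edges) have already been found. It intersects them in homogeneous coordinates, where the line through two points and the intersection of two lines are both cross products. The function names are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]
Vec3 = Tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p: Point, q: Point) -> Vec3:
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a: Segment, seg_b: Segment, eps: float = 1e-9) -> Optional[Point]:
    """Intersect the lines supporting two segments.

    Returns None when the image lines are (numerically) parallel, i.e. the
    vanishing point lies at infinity.
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)
```

A real scene would give many noisy segments, so a practical estimate would likely combine them with a least-squares or robust fit rather than rely on a single pair.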

Supervisors: Francesco Vaccarino
Academic year: 2019/20
Publication type: Electronic
Number of Pages: 114
Degree course: Master's degree programme in Mathematical Engineering (Corso di laurea magistrale in Ingegneria Matematica)
Degree class: New organization > Master science > LM-44 - MATHEMATICAL MODELLING FOR ENGINEERING
Collaborating companies: ADDFOR S.p.A
URI: http://webthesis.biblio.polito.it/id/eprint/12732