Leap: a Model-Based Reinforcement Learning Framework for Fast Object Detection

Edoardo Roba

Leap: a Model-Based Reinforcement Learning Framework for Fast Object Detection.

Supervisor: Andrea Giuseppe Bottino. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2020

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

The goal of the project was to create a new algorithm for object detection. The starting point was the work presented in "Active Object Localization with Deep Reinforcement Learning", which describes an object detection algorithm based on Deep Reinforcement Learning and a Markov Decision Process. Because the algorithm compares the current state with the environment after every action taken by the agent, computation is slow: every state is encoded through a CNN, a VGG16-like neural network with over 58 million parameters. The purpose of this project was to design a model-based algorithm that bypasses the CNN. During Q-training, many transitions (state, next state, action) are recorded and used as the training dataset for a predictive neural network. This predictive model is a fully connected network with 2 hidden layers and 9 million parameters, so it is much lighter than the CNN. The network is trained with its output fed back as its input, so it can be trained on a sequence of data representing a sequence of states. In addition, an epsilon-greedy policy is adopted to prevent the algorithm from getting stuck in a state during detection. The results show that the more predictions the algorithm performs, the faster it runs: the average is 1.88 seconds per image with no predictions and 0.98 seconds per image with 4 or 5 leaps. However, the more leaps are performed, the lower the accuracy: on average, the algorithm loses 4% accuracy for every leap it performs.
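The leap idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (`epsilon_greedy`, `detect_with_leaps`), the toy stand-ins for the CNN encoder, the predictive network and the Q-function, and the schedule of one real encoding followed by a fixed number of predicted states are all assumptions made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def detect_with_leaps(encode, predict, q_function, n_steps, n_leaps, epsilon, rng):
    """Take n_steps actions, replacing some encoder calls with predicted states.

    encode(actions)        -> state: stand-in for the expensive CNN encoder
    predict(state, action) -> state: stand-in for the light predictive network
    Assumed schedule: one real encoding, then n_leaps predicted states.
    """
    actions = []
    encoder_calls = 0
    state = None
    leaps_left = 0
    for _ in range(n_steps):
        if leaps_left == 0:
            # Expensive path: encode the real observation.
            state = encode(actions)
            encoder_calls += 1
            leaps_left = n_leaps
        else:
            # Cheap path: the previous state (possibly itself predicted)
            # is fed back into the predictive model.
            state = predict(state, actions[-1])
            leaps_left -= 1
        actions.append(epsilon_greedy(q_function(state), epsilon, rng))
    return actions, encoder_calls


# Hypothetical toy stand-ins, only to make the sketch runnable.
def toy_encode(actions):
    return [float(len(actions)), 0.0]


def toy_predict(state, action):
    return [state[0] + 1.0, float(action)]


def toy_q(state):
    return [state[0], -state[0], 0.5]


_, calls_no_leaps = detect_with_leaps(
    toy_encode, toy_predict, toy_q, 10, 0, 0.1, random.Random(0))
_, calls_with_leaps = detect_with_leaps(
    toy_encode, toy_predict, toy_q, 10, 4, 0.1, random.Random(0))
print(calls_no_leaps, calls_with_leaps)
```

In this toy run, 10 steps require 10 encoder calls without leaps and 2 encoder calls with 4 leaps, which is the source of the speed-up. For the timings reported in the abstract, 1.88 s versus 0.98 s per image corresponds to a speed-up of roughly 1.9 times.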

Supervisors: Andrea Giuseppe Bottino
Academic year: 2019/20
Publication type: Electronic
Number of pages: 56
Subjects:
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New organization > Master of Science > LM-25 - Automation Engineering
Joint supervision institution: University of Illinois at Chicago (United States of America)
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/15384