
Design of an FPGA-based accelerator for grape clusters detection

Alessandro Varaldi

Design of an FPGA-based accelerator for grape clusters detection.

Supervisors: Massimo Ruo Roch, Marco Vacca. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2023


This thesis focuses on the hardware implementation of an object detection system, as part of a larger project to build an autonomous grape-harvesting robot equipped with advanced visual recognition capabilities. The primary objective is to deploy a custom object detection algorithm that identifies bunches of grapes, and to design a specialized hardware architecture that runs it quickly and at low cost in an edge-computing setting. Object detection is usually entrusted to a general-purpose processor, which is simple to program but expensive and leaves room for improvement in performance. To improve performance and reduce cost, this thesis proposes a Field Programmable Gate Array (FPGA) hardware acceleration architecture for a custom YOLOv3-Tiny algorithm, a state-of-the-art, real-time object detection system based on a Convolutional Neural Network (CNN) architecture. Several optimization techniques are employed, including post-training 8-bit quantization, merging of Batch Normalization into the convolutional layer, and computing 2D convolution as a General Matrix Multiplication (GeMM) performed by a systolic array.

The thesis is divided into three parts. The first part explains what CNN-based object detection models are and how they work, and then describes in detail the process of creating and training a custom YOLOv3-Tiny model specialized in recognizing bunches of grapes. The second part examines the problems that arise in a direct implementation of the algorithm on an FPGA and presents various optimization solutions, involving both the implemented algorithm and the architectural approach. The third and final part details the hardware implementation of the algorithm, achieved through a VHDL description, together with its simulation and the results obtained, and discusses which platform may be suitable for a real implementation. Finally, the limitations of the proposed design and possible improvements for future work are discussed.
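The three optimizations named in the abstract can be illustrated in software. The following is a minimal Python sketch and not the VHDL design described in the thesis: all function names are hypothetical, the convolution assumes stride 1 and no padding, and the symmetric per-tensor 8-bit scheme is one common choice that the abstract does not confirm.

```python
import math


def fold_batchnorm(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm parameters into conv weights and biases.

    weights holds one flat list of kernel values per output channel.
    """
    folded_w, folded_b = [], []
    for w, b, g, bt, mu, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([x * scale for x in w])
        folded_b.append((b - mu) * scale + bt)
    return folded_w, folded_b


def im2col(image, k):
    """Unroll every k x k patch of a [C][H][W] image into one matrix column.

    Assumes stride 1 and no padding. The result has C*k*k rows and
    out_h*out_w columns, ordered by (channel, dy, dx).
    """
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    out_h, out_w = height - k + 1, width - k + 1
    rows = []
    for c in range(channels):
        for dy in range(k):
            for dx in range(k):
                rows.append([image[c][y + dy][x + dx]
                             for y in range(out_h) for x in range(out_w)])
    return rows


def matmul(a, b):
    """Plain matrix product, the operation a systolic array would carry out."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv2d_gemm(image, weights, biases, k):
    """2D convolution expressed as one matrix multiplication plus bias."""
    out = matmul(weights, im2col(image, k))
    return [[v + b for v in row] for row, b in zip(out, biases)]


def quantize_int8(values):
    """Symmetric per-tensor quantization to the signed range [-127, 127]."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


# Toy data: one 3x3 input channel and one 2x2 kernel (illustrative only).
image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
w, b = fold_batchnorm([[1, 0, 0, 1]], [0.0], gamma=[2.0], beta=[1.0],
                      mean=[0.0], var=[1.0], eps=0.0)
print(conv2d_gemm(image, w, b, k=2))  # [[13.0, 17.0, 25.0, 29.0]]
q, scale = quantize_int8(w[0])
print(q, scale)
```

Folding removes the separate normalization step at inference time, and the im2col layout turns every convolutional layer into the same matrix product, which is what makes a single systolic array reusable across layers.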

Supervisors: Massimo Ruo Roch, Marco Vacca
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 92
Additional Information: Thesis under embargo. Full text not available
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New organization > Master science > LM-25 - AUTOMATION ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/28527