
Instance Segmentation and Visual Servoing for apple harvesting

Adriana Di Terlizzi

Supervisors: Marcello Chiaberge, Mauro Martini, Alessandro Navone, Marco Ambrosio. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2024.

PDF (Tesi_di_laurea), 62 MB. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

In recent years, the rising global population and the growing demand for food, coupled with a shortage of human laborers, have placed immense pressure on the agricultural sector to modernize farming operations through advanced technology in order to produce more food, more efficiently and sustainably. In this context, autonomous harvesting robots have emerged as a promising solution to these challenges. This thesis focuses on designing a visual servoing system for apple harvesting using the Kinova Gen3 Lite, a 6 Degrees of Freedom (DoF) manipulator equipped with an Intel RealSense D435i camera mounted on its end effector. The objective is to enable the robot to accurately identify the target apple and autonomously guide the tool toward it for successful grasping. To address these challenges, the proposed solution comprises two main modules: one dedicated to visual perception and another to manipulator control. Instance segmentation is adopted as a precise and effective strategy for recognizing and localizing individual apples, and a YOLOv8 Convolutional Neural Network (CNN) was trained and fine-tuned to perform this task. An Image-Based Visual Servoing (IBVS) scheme is employed to control the robot's motion efficiently. A comprehensive, integrated pipeline was developed to evaluate the performance of each module individually and of the two combined. The Kinova robot operates from a stationary base with static targets. By running real-time inference on the data stream captured by the RGB-D sensor, the YOLOv8 model detects and segments all apple instances. A selection policy then chooses the target apple, whose visual features are exploited by the IBVS for feedback control. By computing the error directly in image space, this closed-loop system continuously adjusts the camera motion, and thus the end-effector motion, driving the robot tool to the desired pose. Once close to the target, an open-loop maneuver is executed to complete the apple grasp. The proposed architecture is fully implemented in Robot Operating System 2 (ROS2), with Python as the primary programming language. Extensive experiments were conducted with the real robot and mock-up apples at PIC4SeR (PoliTo Interdepartmental Centre for Service Robotics). The results show effective apple instance segmentation and localization by the network, while the IBVS controller correctly drives the end effector toward the target.
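The abstract does not spell out the exact visual features or gains used, so the sketch below is only an illustration of the classical IBVS control law, v_c = -lambda * pinv(L_e) * (s - s*), applied to point features such as the centroid of the selected apple mask. Function names, the single-point feature choice, and the numerical values are assumptions made for illustration, not the thesis implementation.

    # Minimal IBVS sketch (illustrative only): classical image-based visual
    # servoing on normalized point features.
    # Control law: v_c = -lambda * pinv(L) * (s - s_desired)
    # The thesis may use different features (e.g. mask centroid and area from
    # YOLOv8 segmentation); everything below is an assumption for illustration.
    import numpy as np

    def interaction_matrix(x, y, Z):
        """Interaction (image Jacobian) matrix of a normalized image point
        (x, y) at depth Z, relating its image velocity to the 6-DoF camera
        twist [vx, vy, vz, wx, wy, wz]."""
        return np.array([
            [-1.0 / Z, 0.0,      x / Z, x * y,       -(1.0 + x * x),  y],
            [0.0,      -1.0 / Z, y / Z, 1.0 + y * y, -x * y,         -x],
        ])

    def ibvs_velocity(features, desired, depths, gain=0.5):
        """Camera twist that drives the current point features toward the
        desired ones, using the classical IBVS law with a pseudo-inverse."""
        L = np.vstack([interaction_matrix(x, y, Z)
                       for (x, y), Z in zip(features, depths)])
        error = (np.asarray(features) - np.asarray(desired)).reshape(-1)
        return -gain * np.linalg.pinv(L) @ error

    if __name__ == "__main__":
        # Hypothetical example: one point feature (e.g. the centroid of the
        # selected apple mask) with depth taken from the RealSense depth map.
        current = [(0.10, -0.05)]   # where the apple currently appears
        desired = [(0.00, 0.00)]    # where we want it (image centre)
        depths = [0.60]             # estimated depth in metres
        print(ibvs_velocity(current, desired, depths))

In a setup of this kind, the resulting camera twist would be remapped to the end-effector frame and sent to the manipulator's velocity controller at each perception cycle, which is consistent with the closed-loop behaviour described in the abstract.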

Supervisors: Marcello Chiaberge, Mauro Martini, Alessandro Navone, Marco Ambrosio
Academic year: 2024/25
Publication type: Electronic
Number of pages: 136
Subjects:
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New regulations > Master's degree > LM-25 - INGEGNERIA DELL'AUTOMAZIONE
Collaborating companies: Politecnico di Torino - PIC4SER
URI: http://webthesis.biblio.polito.it/id/eprint/33026