
Hossein Zahedi Nezhad
Similarity of Waste Image for Smart Bins Using Deep Learning.
Supervisors: Bartolomeo Montrucchio, Antonio Costantino Marceddu. Politecnico di Torino, Master's degree course in Ict For Smart Societies (Ict Per La Società Del Futuro), 2025
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives. Download (16MB) |
Abstract: |
This dissertation presents a deep learning-based framework for object-level change detection in cluttered visual scenes, focusing on identifying added or reconfigured items between temporally adjacent images. The motivation arises from real-world challenges in automated waste monitoring systems, where detecting changes in bin contents is critical for optimizing collection routes, improving recycling efficiency, and reducing operational costs. To address this need, two approaches were explored. Initially, a Siamese network trained with contrastive loss was implemented to evaluate the feasibility of image-level change detection based on pairwise similarity. While this approach demonstrated potential, it was limited to producing a single similarity score between images, without the ability to localize individual changes or determine the number of added objects. These limitations motivated the transition to a more expressive object-level triplet learning framework, where the network compares anchor–positive–negative tuples. This design enables fine-grained matching, robust feature discrimination, and estimation of added object count. In the triplet-based framework, the system combines polygon-guided object cropping, deep metric learning (i.e., training neural networks to learn a similarity-preserving embedding space), and cosine similarity analysis to compare object instances across image pairs. A COCO-style (Common Objects in Context) annotated dataset of 7,150 real-world waste bin images was used, capturing cluttered scenes with items such as cups, packaging, and organic waste. Objects were extracted using segmentation masks and polygon-fitting, then standardized into 224×224 crops. Four popular convolutional neural networks—ResNet-50, ResNet-101, MobileNetV2, and Xception—were repurposed from their original classification role to serve as backbone feature extractors. 
Each model was modified with a custom projection head that maps high-dimensional features into a 128-dimensional embedding space using global average pooling, dropout, a dense layer, and L2 normalization. This transformation enables angular similarity comparisons via cosine distance, making the architecture suitable for fine-grained object matching. A progressive training strategy was employed to enhance embedding quality and assess the impact of adaptation. In the zero-shot configuration, the backbone remained fully frozen to evaluate how well pre-trained features generalize to the object matching task. In Phase 1, only the projection head was trained while the backbone remained frozen, allowing rapid convergence and preserving general visual priors. In Phase 2, selective fine-tuning of deeper backbone layers was performed while continuing to train the projection head. Training relied on a margin-based triplet loss with a custom mining pipeline incorporating hard positive mining, hard and semi-hard negative mining, and stage-specific augmentations. Strong geometric and photometric transformations were applied to anchors and positives, while negatives were minimally perturbed. Model comparison and performance assessment across all architectures highlight the strengths and trade-offs of each model, confirming that the proposed approach enables accurate detection of added objects and reliable matching of existing instances across image pairs. The system is scalable, robust, and well-suited for integration into real-time waste monitoring systems. |
---|---|
Supervisors: | Bartolomeo Montrucchio, Antonio Costantino Marceddu |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 100 |
Subjects: | |
Degree course: | Master's degree course in Ict For Smart Societies (Ict Per La Società Del Futuro) |
Degree class: | New system > Master's degree > LM-27 - TELECOMMUNICATIONS ENGINEERING |
Collaborating companies: | NOT SPECIFIED |
URI: | http://webthesis.biblio.polito.it/id/eprint/36557 |
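
The abstract describes comparing L2-normalized object embeddings by cosine similarity, training with a margin-based triplet loss, and estimating the number of added objects between two images. The following is a minimal illustrative sketch of those three ideas, not the thesis implementation. The function names, the margin of 0.2, the similarity threshold of 0.8, the use of `1 - cosine similarity` as the distance, and the greedy one-to-one matching rule are all assumptions made here for illustration. Short lists of floats stand in for the 128-dimensional embeddings.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of the normalized vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss on cosine distance (assumed: d = 1 - cos)."""
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def count_added_objects(before, after, threshold=0.8):
    """Greedily match each 'after' embedding to its most similar unmatched
    'before' embedding; objects with no match at or above the threshold
    are counted as added. (Hypothetical matching rule, for illustration.)"""
    unmatched = list(range(len(before)))
    added = 0
    for emb in after:
        best_idx, best_sim = None, threshold
        for i in unmatched:
            sim = cosine_similarity(emb, before[i])
            if sim >= best_sim:
                best_idx, best_sim = i, sim
        if best_idx is None:
            added += 1
        else:
            unmatched.remove(best_idx)
    return added


if __name__ == "__main__":
    before = [[1.0, 0.0], [0.0, 1.0]]
    after = [[0.9, 0.1], [0.1, 0.9], [-1.0, -1.0]]
    print(count_added_objects(before, after))
```

In this toy example the first two objects in `after` match the two objects in `before`, and the third has no counterpart, so it is counted as added. A real pipeline would obtain the embeddings from the trained backbone and projection head, and would likely tune the threshold on validation data.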