Explainability-Driven Deep Learning for Predicting Biological Invasiveness in Plant Species

Barbara Frittella

Explainability-Driven Deep Learning for Predicting Biological Invasiveness in Plant Species.

Supervisors: Daniele Apiletti, Simone Monaco. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Invasive non-native species spread rapidly outside their natural range and can disrupt ecosystems, damage economies, and threaten health. Early identification is therefore critical, yet current ecological practice remains largely manual, and existing deep learning pipelines provide little insight into the morphological traits that define invasiveness. This thesis addresses this gap by developing an explainability-driven deep learning pipeline that links morphological traits to a model's invasiveness predictions and exposes why the model fails on specific images. The approach trains a classifier on image embeddings extracted with BioCLIP-2 and adopts an imageomics perspective, treating images as high-dimensional phenotypes. Saliency-guided region extraction based on Integrated Gradients identifies the image pixels most critical to the model's predictions. Clustering the embeddings of these regions and manually annotating the resulting clusters yields interpretable visual concepts. These clusters are then independently validated and propagated across the dataset to produce image-level concept labels. By analyzing the co-occurrence between concept labels and model outputs, the pipeline identifies which structures support correct invasive detections and which spurious cues drive misclassifications. The method is demonstrated on a dataset of Lythrum images as a case study, offering a scalable path toward more transparent and trustworthy deep learning systems in ecology. Future work could extend the pipeline beyond Lythrum to other taxa and ecological contexts, and integrate complementary data sources to link image-derived concepts more directly to measurable biological traits.
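
As an illustration of the attribution step named in the abstract, the following is a minimal sketch of Integrated Gradients on a toy model. It is not the thesis implementation: the logistic-regression "classifier", its weights `w` and bias `b`, the input `x`, and the all-zero baseline are hypothetical stand-ins chosen so the example runs with the Python standard library alone. The sketch approximates the path integral of gradients from the baseline to the input with a midpoint Riemann sum and then checks the completeness property, namely that the attributions sum to the difference between the model output at the input and at the baseline.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy stand-in classifier: logistic regression over a feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model output with respect to its input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th interval on [0, 1]
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += grad[i] / steps
    # Scale the averaged gradient by the displacement from the baseline.
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]


# Hypothetical weights, input, and baseline (assumptions for illustration only).
w = [2.0, -1.0, 0.0, 0.5]
b = -0.25
x = [1.0, 0.5, 3.0, -1.0]
baseline = [0.0, 0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)

# Index of the feature with the largest absolute attribution.
top = max(range(len(x)), key=lambda i: abs(attributions[i]))

# Completeness: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(
    sum(attributions) - (model(x, w, b) - model(baseline, w, b))
)
```

In an image setting, the feature vector would be replaced by pixels and the analytic gradient by automatic differentiation through the actual network; the highest-attribution pixels would then delimit the regions passed on to clustering.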

Supervisors: Daniele Apiletti, Simone Monaco
Academic year: 2025/26
Publication type: Electronic
Number of pages: 101
Subjects:
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/37834