
XAI tools to predict biological invasiveness: a case study in plants

Guido Spina

XAI tools to predict biological invasiveness: a case study in plants.

Supervisors: Daniele Apiletti, Simone Monaco. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Invasive non-native species are plants, animals, fungi or microorganisms that have been introduced, intentionally or accidentally, into an area where they do not naturally occur, and that can harm the environment, the economy or health by spreading quickly and uncontrollably. Identifying them is important because it allows them to be eradicated if they have already begun to spread, or kept out of a new area altogether. To date, no method identifies which morphological traits make a plant species potentially invasive or non-invasive based exclusively on image data; existing approaches rely instead on categorical or numerical traits that are not always available. In this work we propose a pipeline to identify, within a family of plants, which species have the potential to be invasive and which do not, using the "Lythrum" genus as a case study. To do this we employ BioCLIP 2, a computer vision foundation model specialized in the biological domain, as a feature extractor, and train a classifier on its features to distinguish invasive from non-invasive species. Then, using Integrated Gradients as an explainability method, we highlight the image regions that the classifier identifies as most useful for its prediction. By extracting and clustering these regions we can analyze which morphological traits the classifier takes into consideration, making them candidate features associated with invasiveness. Additionally, it is possible to understand which traits or features lead the model to misclassify. This work thus provides a pipeline to better explore and explain predictions on image data in the biological domain.
Future work could extend the scope of the study to other plant families and integrate the output of the pipeline into existing analyses of species invasiveness that use different features as predictors.
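
To make the attribution step of the abstract concrete, the following is a minimal sketch of Integrated Gradients, not the thesis implementation. It assumes a toy logistic classifier over 16 hypothetical per-patch features in place of BioCLIP 2 embeddings and the trained classifier; the names `model`, `model_grad`, `n_patches` and the all-zeros baseline are illustrative assumptions. It approximates the path integral with a midpoint Riemann sum, checks the completeness property (attributions sum to the difference between the output at the input and at the baseline), and ranks the patches by absolute attribution.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    """Toy differentiable classifier (assumption): probability of the 'invasive' class."""
    return sigmoid(x @ w + b)

def model_grad(x, w, b):
    """Analytic gradient of the toy model's output with respect to its input."""
    p = model(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=500):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    IG_i = (x_i - baseline_i) * integral over alpha in [0, 1] of
           d model / d x_i evaluated at baseline + alpha * (x - baseline).
    """
    alphas = (np.arange(steps) + 0.5) / steps               # midpoints of [0, 1]
    path = baseline + alphas[:, None] * (x - baseline)      # (steps, n_features)
    grads = np.stack([model_grad(point, w, b) for point in path])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical setup: one scalar feature per image patch.
rng = np.random.default_rng(0)
n_patches = 16
w = rng.normal(size=n_patches)
b = 0.1
x = rng.normal(size=n_patches)
baseline = np.zeros(n_patches)      # assumed "uninformative" reference input

attributions = integrated_gradients(x, baseline, w, b)

# Completeness: attributions should sum to model(x) - model(baseline).
completeness_gap = abs(
    attributions.sum() - (model(x, w, b) - model(baseline, w, b))
)

# Patches ranked by absolute attribution; the top ones would be the
# candidate regions to extract and cluster.
top_patches = np.argsort(-np.abs(attributions))[:3]

print("completeness gap:", completeness_gap)
print("top patches:", top_patches)
```

In a setting like the one the abstract describes, the same idea would be applied to image inputs, and the highest-attribution regions would then be cropped and clustered; the choice of baseline and number of steps are tuning decisions that this sketch does not settle.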

Supervisors: Daniele Apiletti, Simone Monaco
Academic year: 2025/26
Publication type: Electronic
Number of pages: 106
Subjects:
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New system > Master's degree > LM-32 - Computer Engineering
Collaborating organizations: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/37875