Leonardo Sgroi
Integrating Multi-Modal Reasoning and Explainable AI for Dermatological Image Analysis via LLM-Orchestrated Toolchains.
Supervisors: Flavio Giobergia, Ignazio Gallo. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2025
PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Skin cancer is one of the most common and dangerous cancers worldwide, and its detection remains a challenge even for expert dermatologists. This thesis explores how artificial intelligence can serve as a trusted assistant in the diagnostic process by combining the reasoning power of Large Language Models (LLMs) with the precision of state-of-the-art vision tools. The proposed framework is a modular agent with a central reasoning core that leverages a set of specialized tools for image classification, lesion detection, patient metadata integration, and explainable AI. Through extensive experiments, the thesis evaluates the contribution of each component. First, the ability of multimodal language models such as GPT-4o and Gemini is analyzed, both in classifying dermatological images with their own vision and in interacting with vision tools.
Furthermore, the research examines the integration of patient information, encoded as text embeddings, into the classification model, to determine whether this data improves the performance of the tool.
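The modular design described in the abstract, a reasoning core that selects among specialized tools, can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Case` container, the tool names, and the stub outputs are all hypothetical, and the `plan` function uses a fixed rule as a stand-in for the LLM reasoning core.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Case:
    """Hypothetical container for one patient case."""
    image_path: str
    metadata: Dict[str, str] = field(default_factory=dict)


# A tool takes a case and returns a short textual finding.
Tool = Callable[[Case], str]


def detect_lesion(case: Case) -> str:
    # Stub: a real tool would run a lesion detection model on the image.
    return "detector: no model attached (stub)"


def classify_lesion(case: Case) -> str:
    # Stub: a real tool would run a trained image classifier.
    return "classifier: no model attached (stub)"


def summarize_metadata(case: Case) -> str:
    # Turns structured patient metadata into text, the form in which
    # it could later be passed to a text embedding model.
    if not case.metadata:
        return "metadata: none provided"
    parts = [f"{key}={value}" for key, value in sorted(case.metadata.items())]
    return "metadata: " + ", ".join(parts)


# Hypothetical tool registry, keyed by the name the planner emits.
TOOLS: Dict[str, Tool] = {
    "detect": detect_lesion,
    "classify": classify_lesion,
    "metadata": summarize_metadata,
}


def plan(case: Case) -> List[str]:
    # Assumption: a fixed rule replaces the LLM call that would
    # normally decide which tools to invoke and in what order.
    steps = ["detect", "classify"]
    if case.metadata:
        steps.append("metadata")
    return steps


def run_agent(case: Case) -> List[str]:
    # Executes the planned tools in order and collects their findings.
    return [TOOLS[name](case) for name in plan(case)]
```

Calling `run_agent(Case("lesion.jpg", {"age": "54", "site": "back"}))` would invoke all three stubs in order. Keeping the tools behind a name-to-function registry is one way to let each component be swapped or evaluated in isolation, which matches the per-component evaluation the abstract mentions.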
