Context-Aware Organ Detection in Laparoscopic Surgery via Vision-Language Models

Antonio Martignano


Supervisors: Alessio Sacco, Guido Marchetto, Flavio Esposito. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2026

PDF (Tesi_di_laurea) - Thesis
Access restricted to: staff users only until 27 March 2029 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Accurately identifying anatomical landmarks in real time is crucial for safety and navigation in laparoscopic surgery. However, the reduced field of view and the strong visual similarity among tissues make it difficult to apply multi-label classification to detect the primary target organ and nearby organs during computer-assisted interventions. Current state-of-the-art vision-language models, such as BiomedCLIP and its prompt-learning variants (BiomedCoOp), have proven successful by treating classification as a visual pattern-matching problem, but they largely ignore the strong spatial and semantic relationships inherent in surgical environments. We propose SpatialContext, a new multi-modal framework that combines frozen visual embeddings with spatial text descriptions derived from segmentation masks to integrate scene context explicitly into the classification process.

We also develop a Context-Conditional Training strategy in which the model must learn to predict the presence of secondary organs from the context provided by the primary surgical target.
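
As a rough illustration of what "spatial text descriptions derived from segmentation masks" could look like, the following Python sketch compares mask centroids and emits a sentence relating each organ to the primary target. It is not the thesis implementation: the function names (`centroid`, `relation`, `describe_scene`), the centroid heuristic, the sentence template, and the toy liver/gallbladder masks are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a binary mask as rows of 0/1 values,
# with 1 marking pixels that belong to the organ.
Mask = List[List[int]]


def centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (row, col) centroid of the nonzero pixels of a mask."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    n = len(points)
    return sum(r for r, _ in points) / n, sum(c for _, c in points) / n


def relation(primary: Mask, other: Mask) -> str:
    """Describe where `other` lies relative to `primary` in image coordinates."""
    p_row, p_col = centroid(primary)
    o_row, o_col = centroid(other)
    d_row, d_col = o_row - p_row, o_col - p_col
    if d_row == 0 and d_col == 0:
        return "overlapping"
    # Pick the dominant axis of displacement (assumed heuristic).
    if abs(d_col) >= abs(d_row):
        return "to the right of" if d_col > 0 else "to the left of"
    return "below" if d_row > 0 else "above"


def describe_scene(primary_name: str, masks: Dict[str, Mask]) -> str:
    """Build a text description of the scene centred on the primary organ."""
    primary = masks[primary_name]
    sentences = [f"The primary target is the {primary_name}."]
    for name, mask in masks.items():
        if name == primary_name:
            continue
        sentences.append(
            f"The {name} is {relation(primary, mask)} the {primary_name}."
        )
    return " ".join(sentences)


# Toy 4x4 masks, invented for this example only.
liver = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
gallbladder = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

text = describe_scene("liver", {"liver": liver, "gallbladder": gallbladder})
print(text)
```

In a pipeline like the one the abstract outlines, such a sentence would be passed to the text encoder alongside the frozen image embedding; how SpatialContext actually builds and fuses its descriptions is not stated on this page.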

Supervisors

Alessio Sacco, Guido Marchetto, Flavio Esposito

Academic Year

2025/26

Publication Type

Electronic

Number of Pages

Degree Programme

Master's degree programme in Computer Engineering (Ingegneria Informatica)

Degree Class

New system > Master's degree > LM-32 - Computer Engineering (INGEGNERIA INFORMATICA)

URI

https://webthesis.biblio.polito.it/id/eprint/39910
