Antonio Martignano
Context-Aware Organ Detection in Laparoscopic Surgery via Vision-Language Models.
Supervisors: Alessio Sacco, Guido Marchetto, Flavio Esposito. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2026
PDF (Tesi_di_laurea)
- Thesis
Access restricted to: staff users only until 27 March 2029 (embargo date). License: Creative Commons Attribution Non-commercial No Derivatives. Download (25MB)
Abstract
Accurately identifying anatomical landmarks in real time is crucial for safe navigation during laparoscopic surgery. However, the reduced field of view and the strong visual similarity among tissues make it difficult to use multi-label classification to detect the primary target organ and nearby organs during computer-assisted interventions. Although current state-of-the-art vision-language models, such as BiomedCLIP and its prompt-learning variants (BiomedCoOp), have proven successful by treating classification as a visual pattern-matching problem, they largely ignore the strong spatial and semantic relationships inherent in surgical environments. We propose a new multi-modal framework, SpatialContext, that combines frozen visual embeddings with spatial text descriptions derived from segmentation masks to integrate scene context explicitly into the classification process.
A new Context-Conditional Training strategy is developed, in which the model is forced to learn to predict the presence of secondary organs from the context provided by the primary surgical target.
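The abstract does not say how the spatial text descriptions are produced from the segmentation masks. As a purely illustrative sketch, one simple possibility is to compare mask centroids and phrase each organ's position relative to the primary target. The function names `centroid` and `describe_scene`, the sentence template, and the centroid heuristic are all assumptions made for this example and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a binary mask as rows of 0/1 values.
Mask = List[List[int]]


def centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (row, col) centroid of the nonzero pixels of a binary mask."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("empty mask")
    n = len(points)
    return sum(r for r, _ in points) / n, sum(c for _, c in points) / n


def describe_scene(masks: Dict[str, Mask], primary: str) -> str:
    """Describe where each secondary organ lies relative to the primary one.

    Assumed heuristic: compare centroids in image coordinates
    (row 0 is the top of the image, column 0 is the left edge).
    """
    pr, pc = centroid(masks[primary])
    parts = []
    for name, mask in masks.items():
        if name == primary:
            continue
        r, c = centroid(mask)
        vertical = "above" if r < pr else "below"
        horizontal = "left of" if c < pc else "right of"
        parts.append(f"{name} is {vertical} and {horizontal} the {primary}")
    text = f"Primary target: {primary}."
    if parts:
        text += " " + "; ".join(parts) + "."
    return text


# Toy 3x3 masks: liver in the top-left corner, gallbladder in the bottom-right.
masks = {
    "gallbladder": [[0, 0, 0], [0, 0, 0], [0, 0, 1]],
    "liver": [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
}
print(describe_scene(masks, "gallbladder"))
```

In a pipeline of the kind the abstract outlines, a string like this would be encoded by the text branch and combined with the frozen visual embedding. How the thesis performs that combination is not stated here.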
