
Near-duplicate image detection for the insurance field

Arianna Parisi


Supervisors: Fabrizio Lamberti, Lia Morra. Politecnico di Torino, Master's degree course in Ict For Smart Societies (Ict Per La Società Del Futuro), 2020


Near-duplicate (ND) detection aims at finding images of the same scene or place within a large collection. It differs from the extensively studied image retrieval task: the objective of ND detection is to find different versions of the same object, with a classification function separating ND from non-ND pairs, whereas image retrieval is usually a search for images that are semantically similar to a specific query. Near-duplicate discovery has many applications, such as social media evaluation, web content creation, image forensics and fraud detection. I created a machine learning method that exploits image descriptors to learn similarities between image pairs. In particular, I developed a Siamese architecture composed of convolutional neural networks that extracts the features of each image pair and computes similarity scores and pairwise distances. The network is trained on a private database owned by Reale Mutua Assicurazioni, consisting of claims that mostly concern building damage. The final goal is to identify all possible near-duplicates of a set of queries by thresholding the output score of the model. Because false positives have to be checked by the user, a high specificity is required. For evaluation, the problem is treated as a binary classification of near-duplicate vs. non-near-duplicate pairs. Performance is assessed through the Receiver Operating Characteristic (ROC) curve, and a final Area Under the Curve score of 0.73 is achieved.
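The evaluation described in the abstract (thresholding a pairwise score, measuring specificity, summarising with the area under the ROC curve) can be illustrated with a minimal Python sketch. This is not the thesis code, which is not publicly available: the function names, the toy scores and labels, and the 0.8 specificity target are all assumptions made for illustration. Label 1 marks a near-duplicate pair and a higher score means "more likely near-duplicate".

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random ND pair outscores a random non-ND pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def specificity(scores, labels, threshold):
    """Fraction of non-ND pairs correctly rejected (score below threshold)."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(1 for s in neg if s < threshold) / len(neg)


def sensitivity(scores, labels, threshold):
    """Fraction of ND pairs correctly flagged (score at or above threshold)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    return sum(1 for s in pos if s >= threshold) / len(pos)


def lowest_threshold_for_specificity(scores, labels, min_specificity):
    """Smallest observed score that, used as threshold, meets the specificity target."""
    for t in sorted(set(scores)):
        if specificity(scores, labels, t) >= min_specificity:
            return t
    return None  # no observed score reaches the target


# Hypothetical similarity scores for eight image pairs (illustrative only).
toy_scores = [0.95, 0.80, 0.40, 0.70, 0.30, 0.20, 0.10, 0.60]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
threshold = lowest_threshold_for_specificity(toy_scores, toy_labels, 0.8)
recall = sensitivity(toy_scores, toy_labels, threshold)
print(auc, threshold, recall)
```

Choosing the lowest threshold that still meets the specificity target keeps as many true near-duplicates as possible while limiting the number of false positives a user has to review.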

Supervisors: Fabrizio Lamberti, Lia Morra
Academic year: 2019/20
Publication type: Electronic
Number of Pages: 82
Additional Information: Confidential thesis. Full text not available
Degree course: Master's degree course in Ict For Smart Societies (Ict Per La Società Del Futuro)
Degree class: New organization > Master science > LM-27 - TELECOMMUNICATIONS ENGINEERING
Collaborating companies: REALE MUTUA ASSICURAZIONI
URI: http://webthesis.biblio.polito.it/id/eprint/14391