Matteo Gambino
Optimizations and efficient retrieval solutions for large-scale visual geo-localization problems.
Supervisor: Carlo Masone. Politecnico di Torino, Master's degree in Ingegneria Informatica (Computer Engineering), 2022
| PDF (Tesi_di_laurea) - Thesis. License: Creative Commons Attribution Non-commercial No Derivatives (4MB) |
Abstract: |
Visual geo-localization (VG) is the task of determining the location where a photo was taken using only visual information. It plays an important role in numerous applications, such as categorizing images in photo collections, augmented reality, and localizing mobile robots, and is therefore an active area of research. The task is commonly approached as an image retrieval problem: given a query, its location is inferred by performing a similarity search, via k-nearest neighbour (kNN), over a database of geotagged images. While this approach achieves remarkable results on moderately sized problems, it does not scale well to large maps, for two reasons: i) executing the kNN search requires keeping in memory the embeddings extracted from all the images in the database, so as the database grows the required memory can quickly become infeasible; ii) the time required to perform the kNN search grows linearly with the size of the database, so as the database grows the latency for processing a single query may become unsustainable. This thesis addresses these problems by investigating various solutions to improve the retrieval pipeline of visual geo-localization methods. In particular:
• It investigates the impact of using advanced indexing techniques in the similarity search, such as inverted file indexes and product quantization. The effect of these techniques is evaluated in terms of the accuracy of the final results, the memory footprint, and the time required to perform the retrieval. The results not only demonstrate that appropriate indexing techniques enable VG problems to scale to large databases with minimal loss in accuracy, but can also serve as a guideline for developers choosing the right solution for their required trade-off between resources and performance.
• It presents a novel solution for filtering the database images based on their semantic content, in order to reduce the search space of the similarity search and thus ultimately decrease its memory and time complexity.
All these analyses and solutions have been developed on realistic urban datasets of various scales. |
---|---|
Supervisors: | Carlo Masone |
Academic year: | 2022/23 |
Publication type: | Electronic |
Number of pages: | 91 |
Subjects: | |
Degree programme: | Master's degree in Ingegneria Informatica (Computer Engineering) |
Degree class: | New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA (Computer Engineering) |
Collaborating companies: | NOT SPECIFIED |
URI: | http://webthesis.biblio.polito.it/id/eprint/25610 |
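The retrieval setting described in the abstract can be illustrated with a small sketch. It is not taken from the thesis. It contrasts an exhaustive kNN search, which compares the query with every database embedding, with a simplified inverted-file search that only visits the list attached to the closest centroid. The toy 2-D "embeddings", the hand-picked centroids, and all function names (`knn_exhaustive`, `build_ivf`, `knn_ivf`) are assumptions made for illustration. A real inverted file index would normally learn its centroids with k-means, and production systems typically rely on a dedicated similarity-search library.

```python
import random


def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_exhaustive(query, database, k):
    """Brute-force kNN: the query is compared with all N embeddings."""
    order = sorted(range(len(database)), key=lambda i: l2(query, database[i]))
    return order[:k]


def build_ivf(database, centroids):
    """Build one inverted list per centroid, holding the ids of the
    database vectors whose nearest centroid it is."""
    lists = [[] for _ in centroids]
    for i, vec in enumerate(database):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(i)
    return lists


def knn_ivf(query, database, centroids, lists, k, nprobe=1):
    """Approximate kNN: rank only the vectors stored in the nprobe
    inverted lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    candidates.sort(key=lambda i: l2(query, database[i]))
    return candidates[:k]


random.seed(0)
# Toy data (assumption): two well-separated 2-D clusters standing in for
# image descriptors extracted from geotagged photos.
database = [[random.gauss(0, 0.1), random.gauss(0, 0.1)] for _ in range(50)]
database += [[random.gauss(5, 0.1), random.gauss(5, 0.1)] for _ in range(50)]
centroids = [[0.0, 0.0], [5.0, 5.0]]  # hand-picked here; normally learned with k-means

lists = build_ivf(database, centroids)
query = [5.0, 5.1]
exact = knn_exhaustive(query, database, k=3)
approx = knn_ivf(query, database, centroids, lists, k=3, nprobe=1)
print(exact, approx)
```

In this toy setup the inverted-file search ranks only half of the database and still returns the same neighbours as the exhaustive search. On real data, a small `nprobe` trades some accuracy for speed, which is the kind of trade-off the abstract says the thesis evaluates.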