polito.it
Politecnico di Torino (logo)

Similarity Join Queries: Techniques and Optimizations

Simone Zanella

Similarity Join Queries: Techniques and Optimizations.

Rel. Tania Cerquitelli. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering), 2023

[img]
Preview
PDF (Tesi_di_laurea) - Tesi
Licenza: Creative Commons Attribution Non-commercial No Derivatives.

Download (1MB) | Preview
Abstract:

In this thesis we are going to propose efficient algorithms and data structures to handle similarity join queries over any number of constant relations in the dynamic setting with delay guarantees. We provide a lower bound complexity proof for the dynamic case, in which points are inserted and deleted after an initial preprocessing phase where all the needed data structures are created. We analyze special cases in which constraints allow us to efficiently answer a similarity join query exactly. Then, we design a grid-based approximation data structure for any dynamic similarity join query proving delay guarantees. A reduction to equi-joins is also provided. Finally an implementation of similarity join approximation algorithm with tests on different use cases is added at the end in appendices section.

Relators: Tania Cerquitelli
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 72
Subjects:
Corso di laurea: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Classe di laurea: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Ente in cotutela: UNIVERSITY OF ILLINOIS AT CHICAGO (STATI UNITI D'AMERICA)
Aziende collaboratrici: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/28022
Modify record (reserved for operators) Modify record (reserved for operators)