Subgraph Isomorphism: parallelizing dataflow of an FPGA based architecture

Giuseppe Salvatore Piazza

Subgraph Isomorphism: parallelizing dataflow of an FPGA based architecture.

Rel. Luciano Lavagno. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Elettronica (Electronic Engineering), 2024

Preview

PDF (Tesi_di_laurea) - Tesi
Licenza: Creative Commons Attribution Non-commercial No Derivatives.
Download (2MB) | Preview

Abstract:	Subgraph isomorphism is a well-known NP-hard problem involving a large graph, referred to as the datagraph, and a smaller graph, known as the querygraph. In this context, both graphs have labeled vertices, and an isomorphism exists when one or more subgraphs of the datagraph contain nodes with labels that match those of the query graph’s nodes, and these nodes are connected by the same edges. Currently, numerous CPU-based algorithms are available for this matching process. To enhance speed, GPU-based algorithms can be considered; however, they demand significant power, especially when high-bandwidth communication channels are involved. To achieve lower power consumption while leveraging highly parallel architectures, an FPGA-based core may be a more effective solution. In this thesis, the LESS architecture, developed using Vitis HLS for Xilinx Kria boards, is optimized through parallelization to enhance performance. LESS implements the Multi Way Join (MWJ) algorithm, utilizing a hash table mechanism that is built starting from the query graph and enables efficient storage of data graph edges, thereby streamlining edge reading tasks. However, a significant bottleneck arises from DRAM access, as reading edges necessitates numerous read operations. The aim of this parallelization effort is to duplicate blocks that frequently access edge tables and distribute multiple instances of these blocks across various memory banks, ultimately achieving increased throughput. To enable parallelization by a factor of N, the preprocessing phase has been adjusted to account for the increased number of tables that need to be populated, which is now multiplied by N. This necessitates early identification of the destination processing branch for each edge in the data graph. In the experiment conducted, second-degree parallelism was explored, utilizing the most significant bit (MSB) of the node’s hash to designate the destination memory bank for each edge. The remaining bits of the hash continue to be employed as in previous implementation. Instances of the MWJ algorithm can operate independently, even when tasks are duplicated, because each instance is linked to a separate memory bank. Once each set of edges has been processed by its belonging branch, all data can be converged into single stream by an additional task which must mix streams in correct way. Another factor to consider during duplication is the format of streamed data across channels between tasks. Nearly half of the work involves decompressing certain streams (originally compressed to minimize streaming effort) and inserting extra tuples into specific streams to ensure data alignment across different channels.
Relatori:	Luciano Lavagno
Anno accademico:	2024/25
Tipo di pubblicazione:	Elettronica
Numero di pagine:	40
Soggetti:
Corso di laurea:	Corso di laurea magistrale in Ingegneria Elettronica (Electronic Engineering)
Classe di laurea:	Nuovo ordinamento > Laurea magistrale > LM-29 - INGEGNERIA ELETTRONICA
Aziende collaboratrici:	NON SPECIFICATO
URI:	http://webthesis.biblio.polito.it/id/eprint/34101

Modifica (riservato agli operatori)