polito.it
Politecnico di Torino (logo)

Designing new Maximum Common Subgraph solvers: Heuristics, Multi-Threading and Graph Neural Networks

Marco Porro

Designing new Maximum Common Subgraph solvers: Heuristics, Multi-Threading and Graph Neural Networks.

Rel. Stefano Quer. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering), 2023

[img]
Preview
PDF (Tesi_di_laurea) - Tesi
Licenza: Creative Commons Attribution Non-commercial No Derivatives.

Download (9MB) | Preview
Abstract:

The Maximum Common Subgraph (MCS) is a complex theoretical computer science problem. It is a generalization of the Subgraph Isomorphism challenge and it finds numerous applications in chemistry, biology, medicine, and network management. This research focuses on improving the performance of the McSplit algorithm, a popular recursive Branch-and-Bound ground-truth MCS solver. The work aims to develop a tool that efficiently identifies the largest subgraph of a graph pair within a given time limit, addressing real-world scenarios, whereas finding the optimal solution is not necessarily the primary goal. A diverse range of strategies has been explored to identify the most effective approaches for future advancements, leading to the development of multiple prototype solvers belonging to three families. Firstly, a set of static sorting heuristics has been selected and deployed to the McSplit algorithm to establish a best-first vertex selection policy during the tree search. Then, these heuristics have been combined with a newly designed multi-thread parallel architecture, optimizing the allocation of processor time to promising branches of the recursive algorithm. Finally, Graph Neural Networks (GNNs) have been employed to identify and prioritize the most favorable branches at each intersection of the tree search, acting as a dynamic sorting heuristic. To assess the performance of these tools, we conducted extensive testing using open-source datasets representing Internet infrastructure networks and other relevant real-world applications. The evaluation of a meaningful set of performance metrics demonstrates that introducing the new heuristics provides valuable improvements over existing state-of-the-art MCS solvers in the targeted use cases, especially when the solvers are fine-tuned to the specific application domain. In conclusion, this research project has successfully enhanced the capabilities of the McSplit algorithm, providing notable advancements in solving the Maximum Common Subgraph problem for practical applications. This achievement has been recognized through the acceptance of our paper titled \textit{A web scraping algorithm to enhance maximum common subgraph computation} at ICSOFT 2023. Moreover, the analysis of the proposed methodologies, including sorting heuristics, parallel architecture, and Graph Neural Networks, offer valuable insights for future research to determine the most promising strategies among the explored approaches.

Relators: Stefano Quer
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 136
Subjects:
Corso di laurea: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Classe di laurea: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Ente in cotutela: UNIVERSITY OF ILLINOIS AT CHICAGO (STATI UNITI D'AMERICA)
Aziende collaboratrici: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/27652
Modify record (reserved for operators) Modify record (reserved for operators)