polito.it
Politecnico di Torino (logo)

New tensor network contraction techniques and algorithms for the evaluation of marginals in probabilistic graphical models

Matteo Simeone

New tensor network contraction techniques and algorithms for the evaluation of marginals in probabilistic graphical models.

Rel. Alfredo Braunstein, Stefano Crotti. Politecnico di Torino, Corso di laurea magistrale in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2024

[img]
Preview
PDF (Tesi_di_laurea) - Tesi
Licenza: Creative Commons Attribution Non-commercial No Derivatives.

Download (4MB) | Preview
Abstract:

Evaluating properties of a probabilistic model, such as means, variances and correlations from probability distributions is inherently challenging due to its complexity, here defined as the number of operations and memory allocations required, which often scale exponentially in the number of variables involved. Indeed, just thing about the Ising model and the Potts model, heavily used in the statistical physics context and related fields, for which the evaluation of the partition function Z, necessary for the evaluation of marginals, requires an exponential number of operations, i.e. for N Ising spins the calculation of Z requires 2^N operations. Tensor networks (TNs) offer a novel approach to handle probability distributions, enabling the use of optimization techniques and heuristic strategies to significantly reduce the complexity of the calculation of marginals and normalisation constants of whatever the probabilistic model of interest is. The key tools in the TN framework allow to map the calculation of a property of interest of a probabilistic model into a TN contraction problem, where the balance between the precision on the result and the complexity of the contraction is tunable through input parameters chosen by the user. In my thesis I present two algorithms I've written from scratch in Julia programming language, namely MCE algorithm and RRC algorithm, in addition to a deep analysis of the TN’s methodologies and the already existing methods and algorithms at disposal in the literature. Two sections are devoted to simulations of the calculation of the partition function of different Ising models on different underlying graphs, in which errors and computational times are evaluated and analysed as a function of the input parameters. In the end, a final section is devoted to a biophysics application which is the inverse problem of determining Potts model parameters starting from experimental observations, known as Direct Coupling Analysis: it is a statistical framework designed to capture the variability within a group of phylogenetically related biological sequences. This method allows to determine the native contacts of proteins using multiple sequence alignment of homologous proteins, relying on evolutionary conservation and statistical correlations. The main results of my thesis are: 1. The two algorithms I developed have computational times which are always polynomial in the number of variables of the probabilistic model of interest. The precision on the results and the related computational times are tunable via 3 input parameters chosen by the user. These algorithms are general, i.e. they work with whatever probability distribution, still being polynomial. In the case of the Ising model's simulations, both the average magnetization and the partition function are evaluated in a single run of the two algorithms, without doing derivatives. 2. A map allowing to transform the problem of contracting a generic TN into the problem of contracting a square lattice TN, for which the computational times scale at most like N^{1.5} using RRC algorithm, where N is the number of original variables of the model of interest. 3. An analysis and extension of some interesting existing algorithms and techniques already used in papers in the literature.

Relatori: Alfredo Braunstein, Stefano Crotti
Anno accademico: 2024/25
Tipo di pubblicazione: Elettronica
Numero di pagine: 74
Soggetti:
Corso di laurea: Corso di laurea magistrale in Physics Of Complex Systems (Fisica Dei Sistemi Complessi)
Classe di laurea: Nuovo ordinamento > Laurea magistrale > LM-44 - MODELLISTICA MATEMATICO-FISICA PER L'INGEGNERIA
Aziende collaboratrici: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/33080
Modifica (riservato agli operatori) Modifica (riservato agli operatori)