
Analysis of protein families using a reduced space representation and the expectation propagation method

Alessandro Macis

Analysis of protein families using a reduced space representation and the expectation propagation method.

Supervisors: Andrea Pagnani, Anna Paola Muntoni. Politecnico di Torino, Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

In this study, we explore the sequence space of several families of homologous proteins through dimensionality reduction, specifically Principal Component Analysis (PCA), a technique known to be effective on such data. Our primary objective is to establish a mapping from sequence space to a reduced-dimensional space spanned by the principal components of the covariance matrix of the data (the protein family). This reduction streamlines the analysis while preserving the relevant statistical (and hopefully biological) information. To this end, we use the Expectation Propagation (EP) algorithm, which lets us estimate the probability distribution of a sequence that satisfies the imposed projection constraints and thereby solve the corresponding inverse problem. EP thus gives insight into the reduced space and improves our ability to interpret and work within it. We demonstrate the utility of the approach by applying it to various analyses of protein families in the reduced space, which retains the essential biological features of the original data. We applied the method to sample and characterize the Potts energy landscape within the reduced-dimensional space.
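
As an illustration of the forward mapping described in the abstract (sequence space to the space of principal components), the following is a minimal sketch, not the thesis's implementation. It assumes a one-hot encoding of aligned sequences over a 21-letter alphabet (gap plus 20 amino acids, in an arbitrary order chosen here), and the names `one_hot`, `pca_projection`, and the toy alignment `msa` are hypothetical. The Expectation Propagation step that addresses the inverse problem is not shown.

```python
import numpy as np

# Assumed alphabet ordering: gap symbol followed by the 20 amino acids.
ALPHABET = "-ACDEFGHIKLMNPQRSTVWY"


def one_hot(seqs):
    """Encode aligned sequences of equal length as an (N, L*q) binary matrix."""
    index = {a: i for i, a in enumerate(ALPHABET)}
    q = len(ALPHABET)
    X = np.zeros((len(seqs), len(seqs[0]) * q))
    for n, seq in enumerate(seqs):
        for i, a in enumerate(seq):
            X[n, i * q + index[a]] = 1.0
    return X


def pca_projection(X, k):
    """Return the data mean, the top-k principal directions, and the projections."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / X.shape[0]            # empirical covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest
    V = eigvecs[:, order]
    return mean, V, Xc @ V


# Toy alignment (hypothetical data, for illustration only).
msa = ["ACD-", "ACDE", "GCD-", "GCEE", "ACEE"]
X = one_hot(msa)
mean, V, Z = pca_projection(X, k=2)
print(Z.shape)
```

Each row of `Z` holds the coordinates of one sequence in the reduced space. Going the other way, from a point in the reduced space back to a distribution over sequences, is the inverse problem that the abstract assigns to EP.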

Supervisors: Andrea Pagnani, Anna Paola Muntoni
Academic year: 2023/24
Publication type: Electronic
Number of pages: 52
Subjects:
Degree programme: Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi)
Degree class: New regulations > Master's degree > LM-44 - MATHEMATICAL AND PHYSICAL MODELLING FOR ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/31436