
Jacopo Boccato
Manipulating specificity in biological sequences with representation learning.
Supervisors: Andrea Pagnani, Jorge Fernandez De Cossio Diaz, Simona Cocco, Remi Monasson. Politecnico di Torino, Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2025
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives. Download (7MB) |
Abstract: |
The design and generation of biological sequences with desired functional properties is a central challenge in biology and an active area of research, commonly approached with generative models. Generative models that rely on representation learning often learn an entangled representation of the data, which makes the generation of new sequences hard to control. To address this issue, we apply and extend an algorithm for Restricted Boltzmann Machines (RBMs) that enables control over learned features, allowing the targeted generation of sequences with specific functional characteristics. The approach is tested on a case study, the Lattice protein model, and is generalized for use on a real biological example, the WW domain family. In both settings, we show that the learned representations capture interpretable modes of variability, such as electrostatic properties or binding preferences, which can be specifically manipulated while keeping the other properties unchanged. These results highlight the potential of this RBM-based algorithm and suggest several possible improvements and applications. |
---|---|
Supervisors: | Andrea Pagnani, Jorge Fernandez De Cossio Diaz, Simona Cocco, Remi Monasson |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 26 |
Subjects: | |
Degree programme: | Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi) |
Degree class: | New system > Master's degree > LM-44 - Mathematical and Physical Modelling for Engineering |
Joint supervision institution: | IPhT (CEA-)Saclay (FRANCE) |
Collaborating companies: | CEA Saclay |
URI: | http://webthesis.biblio.polito.it/id/eprint/36401 |