Jacopo Boccato
Manipulating specificity in biological sequences with representation learning.
Supervisors: Andrea Pagnani, Jorge Fernandez De Cossio Diaz, Simona Cocco, Remi Monasson. Politecnico di Torino, Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2025
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Designing and generating biological sequences with desired functional properties is a central challenge in biology and an active area of research that has been approached using generative models. Generative models that rely on representation learning often learn an entangled representation of the data, which makes the generation of new sequences hard to control. To address this issue, we apply and extend an algorithm for Restricted Boltzmann Machines (RBMs) that enables control over learned features, allowing the targeted generation of sequences with specific functional characteristics. The approach is tested on a case study, the Lattice protein model, and then generalized to a real biological example, the WW domain family.
In both settings, we show that the learned representations capture interpretable modes of variability, such as electrostatic properties or binding preferences, which can be specifically manipulated while keeping the other properties unchanged.
