Francesco Caredda
Attention Based Direct Coupling Analysis for Protein Structure Prediction.
Supervisor: Andrea Pagnani. Politecnico di Torino, Master of Science program in Physics of Complex Systems, 2022
Licence: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Proteins are at the basis of every biological function within the cell, performing a variety of transport, signaling and enzymatic tasks. Their function relies heavily on their three-dimensional structure, which is extremely difficult, time-consuming and expensive to determine. In this thesis we discuss Direct Coupling Analysis (DCA), the state-of-the-art statistical physics model used to learn structural information about co-evolving proteins from their amino-acid sequences. Phylogenetically related homologous sequences can be considered as belonging to a single protein family, with specific structural properties that define its function. For our purposes such sequences, aligned and collected in a data structure called a Multiple Sequence Alignment (MSA), can be thought of as samples drawn from a probability distribution encoding the fundamental structural traits of the protein family they belong to.
The form of the distribution is obtained by applying the Maximum Entropy Principle, imposing as empirical constraints the single and pairwise frequency counts of the amino acids in the MSA.
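As an illustration of the empirical constraints mentioned above, the following minimal sketch computes single-site and pairwise amino-acid frequencies from a toy MSA. It is not taken from the thesis: the alphabet ordering, the function names `encode_msa` and `frequencies`, and the four toy sequences are all assumptions made for the example, and it omits the sequence reweighting and pseudocounts that practical DCA pipelines usually apply.

```python
import numpy as np

# Assumed alphabet: gap symbol plus the 20 standard amino acids.
ALPHABET = "-ACDEFGHIKLMNPQRSTVWY"
Q = len(ALPHABET)

def encode_msa(sequences):
    """Map aligned sequences (equal length) to an integer array of shape (M, L)."""
    index = {c: k for k, c in enumerate(ALPHABET)}
    return np.array([[index[c] for c in seq] for seq in sequences], dtype=int)

def frequencies(msa, q=Q):
    """Return single-site f_i(a), shape (L, q), and pairwise f_ij(a, b), shape (L, L, q, q)."""
    M, L = msa.shape
    one_hot = np.eye(q)[msa]                      # shape (M, L, q)
    f_i = one_hot.mean(axis=0)                    # fraction of sequences with symbol a at site i
    f_ij = np.einsum("mia,mjb->ijab", one_hot, one_hot) / M  # co-occurrence fractions
    return f_i, f_ij

# Hypothetical toy alignment: 4 sequences of length 4.
toy_msa = encode_msa(["ACD-", "ACE-", "GCDA", "ACDA"])
f_i, f_ij = frequencies(toy_msa)
```

In this toy alignment, `f_i[0, ALPHABET.index("A")]` is the fraction of sequences carrying `A` at the first position, and summing `f_ij[i, j]` over its last axis recovers `f_i[i]`, which is the consistency the pairwise constraints must satisfy.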
