Optimization of an RNA coarse-grained force field with Machine Learning

Gianluca Lombardi

Supervisors: Alessandro Pelizzola, Samuela Pasquali, Frédéric Lechenault. Politecnico di Torino, Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Ribonucleic acid, or RNA, is a linear biological polymer involved in a wide variety of functions, in both human cells and viruses, as the recent pandemic has also highlighted. The function of an RNA molecule is strongly linked to its three-dimensional structure, which is why the RNA folding problem has attracted great interest in recent years. Compared with protein structures, RNA structures are less stable and depend heavily on the biochemical conditions of the surrounding environment. One possible approach to the problem relies on coarse-grained physical models, which speed up molecular dynamics simulations of RNA folding that would otherwise be too computationally expensive. Among these models, HiRE-RNA is a high-resolution force field in which each nucleotide is represented by 6-7 beads and the interactions have specific functional forms, designed to reproduce experimental results through simulations and the extraction of thermodynamic quantities. The interactions implemented in the force field can be divided into local interactions, which act between bonded particles, and long-range interactions, such as stacking, planarity and hydrogen bonding between nitrogenous bases. In this internship project, we aimed to optimize the HiRE-RNA parameters related to local interactions, namely the couplings and equilibrium values for bonds, angles and torsions, building a machine learning setup from scratch. Starting from a limited number of heterogeneous RNA sequences, we created a suitable dataset and developed a machine learning model that computes the energies with HiRE-RNA. The optimization is performed with Stochastic Gradient Descent, using a loss function designed to match the coarse-grained energies to those computed in the atomistic representation with the Amber software.
Because of the large number of parameters and the constraints imposed by their physical meaning, we introduced different methods, varying the degrees of freedom of the system and the functional forms of the interactions. The different implementations were then compared by analyzing the distributions of physical quantities extracted from molecular dynamics simulations, such as bond lengths and angles. Although the results are not definitive, they already improve the model's performance and constitute a good starting point for future developments. In particular, we obtained the correct energy scale for the considered interactions, which allowed us to perform better molecular dynamics simulations on short sequences, for which long-range interactions play a negligible role. Moreover, we identified the terms that most influenced the result and made the corrections needed to improve their effect on the global structure. However, some issues remain unsolved, such as the dependence of some final parameters on their initial values. The optimization of the terms related to long-range interactions will be part of future work and will involve different methods and more complex models, but it will benefit from these first results, which showed both the potential and some limitations of this approach.
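
The energy-matching idea described in the abstract can be illustrated with a toy sketch. This is not the thesis code. It assumes, purely for illustration, that the local interaction is a sum of harmonic bond terms k_i (r_i - r0_i)^2, and it replaces the Amber reference energies with synthetic energies generated from known parameters. All function names (`harmonic_energy`, `mse_loss`, `fit_sgd`, `make_toy_data`) and all numerical values are hypothetical. The couplings and equilibrium values are updated by mini-batch stochastic gradient descent on a squared-error loss, and the couplings are clipped to stay positive as a simple stand-in for a physical constraint.

```python
import numpy as np


def harmonic_energy(r, k, r0):
    """Energy of each conformation: sum over bonds of k_i * (r_i - r0_i)**2.

    The harmonic form is an assumption made for this sketch only.
    """
    return np.sum(k * (r - r0) ** 2, axis=1)


def mse_loss(r, e_ref, k, r0):
    """Mean squared difference between model energies and reference energies."""
    return float(np.mean((harmonic_energy(r, k, r0) - e_ref) ** 2))


def fit_sgd(r, e_ref, k_init, r0_init, lr=0.02, epochs=200, batch=20, seed=0):
    """Fit couplings k and equilibrium values r0 by mini-batch SGD."""
    rng = np.random.default_rng(seed)
    k = np.array(k_init, dtype=float)    # copies, so the inputs are not modified
    r0 = np.array(r0_init, dtype=float)
    n = r.shape[0]
    for _ in range(epochs):
        order = rng.permutation(n)       # reshuffle the conformations each epoch
        for start in range(0, n, batch):
            idx = order[start:start + batch]
            d = r[idx] - r0                                   # deviations from equilibrium
            resid = np.sum(k * d ** 2, axis=1) - e_ref[idx]   # energy mismatch per sample
            # Analytic gradients of mean(resid**2) with respect to k and r0.
            grad_k = 2.0 * np.mean(resid[:, None] * d ** 2, axis=0)
            grad_r0 = 2.0 * np.mean(resid[:, None] * (-2.0 * k * d), axis=0)
            k -= lr * grad_k
            r0 -= lr * grad_r0
            k = np.maximum(k, 1e-6)      # keep couplings positive
    return k, r0


def make_toy_data(n=200, seed=1):
    """Synthetic stand-in for the reference data: two bonds with known parameters."""
    rng = np.random.default_rng(seed)
    k_true = np.array([1.0, 2.0])
    r0_true = np.array([1.0, 2.0])
    r = r0_true + 0.3 * rng.normal(size=(n, 2))
    return r, harmonic_energy(r, k_true, r0_true)


if __name__ == "__main__":
    r, e_ref = make_toy_data()
    k0, r00 = np.array([1.5, 1.5]), np.array([1.2, 1.8])
    k, r0 = fit_sgd(r, e_ref, k0, r00)
    print("loss before:", mse_loss(r, e_ref, k0, r00))
    print("loss after: ", mse_loss(r, e_ref, k, r0))
    print("fitted k:", k, "fitted r0:", r0)
```

Because only energies are matched, several parameter combinations can give similar losses on a small dataset, which is one plausible way for fitted values to depend on the starting point; the sketch does not attempt to reproduce or diagnose the behaviour reported in the thesis.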

Supervisors: Alessandro Pelizzola, Samuela Pasquali, Frédéric Lechenault
Academic year: 2022/23
Publication type: Electronic
Number of pages: 38
Subjects:
Degree programme: Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi)
Degree class: New system > Master's degree > LM-44 - Mathematical-Physical Modelling for Engineering
Collaborating organizations: École Normale Supérieure
URI: http://webthesis.biblio.polito.it/id/eprint/24520