Flavio Spuri
Optimizing Genome Representations for Cancer Type Classification.
Supervisor: Alfredo Benso. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2024
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
In recent years, Large Language Models (LLMs) have been successfully adapted to the field of Genomics, as shown by models such as DNABERT, DNABERT-2, and Nucleotide Transformer. Despite this, their application to the challenging field of Cancer Genomics remains unexplored. This thesis examines whether cancer genome analysis can benefit from large, pre-trained Transformer-based models, focusing specifically on the newly introduced HyenaDNA architecture. In HyenaDNA, traditional Attention Layers are replaced by so-called Hyena Filters, which consist of recursions of an element-wise multiplicative gating and a long convolution. This design allows longer sequences to be processed at single-base resolution with subquadratic computational cost, which aligns well with the specific needs of Cancer Genomics.
This study begins by assessing HyenaDNA's capabilities to represent genomic data
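
The recursion of gating and long convolution named in the abstract can be illustrated with a small NumPy sketch. This is not the thesis's or HyenaDNA's implementation: the function names `causal_fft_conv` and `hyena_like_operator` are hypothetical, and the gates and filters are random stand-ins for what the real model learns. It only shows why the cost is subquadratic: each long convolution is computed with an FFT in O(L log L) rather than O(L^2).

```python
import numpy as np


def causal_fft_conv(signal, kernel):
    """Causal convolution of two length-L sequences in O(L log L) via FFT."""
    length = signal.shape[-1]
    # Zero-pad to 2L so the circular FFT convolution equals the linear one.
    n_fft = 2 * length
    spectrum = np.fft.rfft(signal, n=n_fft) * np.fft.rfft(kernel, n=n_fft)
    # Keep only the first L outputs: position t depends on inputs 0..t.
    return np.fft.irfft(spectrum, n=n_fft)[..., :length]


def hyena_like_operator(value, gates, filters):
    """Repeat z <- gate * (filter conv z) once per (gate, filter) pair."""
    z = value
    for gate, kernel in zip(gates, filters):
        z = gate * causal_fft_conv(z, kernel)
    return z


# Illustrative sizes and random inputs (assumptions, not values from the thesis).
rng = np.random.default_rng(0)
seq_len, order = 1024, 2
value = rng.standard_normal(seq_len)
gates = rng.standard_normal((order, seq_len))
filters = rng.standard_normal((order, seq_len))
output = hyena_like_operator(value, gates, filters)
```

Because every step is either element-wise or an FFT, doubling the sequence length roughly doubles the work (up to the logarithmic factor), whereas a standard attention layer would quadruple it.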
