
Statistical methods for multi-omics data integration: a study on Ehlers-Danlos syndrome

Alice Zorzan

Statistical methods for multi-omics data integration: a study on Ehlers-Danlos syndrome.

Supervisor: Enrico Bibbona. Politecnico di Torino, Master's degree programme in Mathematical Engineering, 2025

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Download (4MB)
Abstract:

Rare diseases pose a significant challenge for biomedical research because of limited data availability and complex molecular interactions. Multi-omics integration is emerging as a promising strategy to overcome the limitations of individual omics layers and provide comprehensive biological insights. This work presents a comparative analysis of multi-omics integration approaches applied to Ehlers-Danlos syndrome (EDS), a group of hereditary rare diseases characterized by defects in collagen production. The research operates on three analytical levels: single-omics analyses of transcriptomic and proteomic data, followed by multi-omics integration through statistical techniques. All levels of analysis are performed on both bulk (2D) and spheroid (3D) cell cultures, to capture shared information and compare possible differences. Data were collected from fibroblast cultures of 14 individuals (10 patients with the disease and 4 healthy controls) under both 2D and 3D conditions. The transcriptomic analysis relied mainly on the DESeq2 framework, which is based on negative binomial generalized linear models, while the proteomic analysis compared three statistical approaches: classical ANOVA, linear models for microarray data (limma), and the non-parametric Wilcoxon test. Multi-omics integration was then performed with three complementary methodologies: Multi-Omics Factor Analysis (MOFA), which identifies shared latent factors across the omics layers; iClusterPlus, which integrates the layers and clusters samples through Bayesian latent variables; and Similarity Network Fusion (SNF), which fuses per-layer patient similarity networks into a single network. Each method was evaluated for its ability to merge multiple omics layers and to identify biologically relevant genes. Results show the potential of integration techniques to capture molecular patterns and to provide biological insights into the most significant genes.
The analysis demonstrates how multi-omics integration can reveal biological insights that are not always accessible through single-omics approaches, supporting the identification of potential therapeutic targets. The systematic comparison of the methods highlights the strengths and limitations of each integrative approach, contributing to a deeper understanding of their applicability to rare diseases.

Supervisors: Enrico Bibbona
Academic year: 2025/26
Publication type: Electronic
Number of pages: 84
Subjects:
Degree programme: Master's degree programme in Mathematical Engineering
Degree class: New system > Master's degree > LM-44 - Mathematical and Physical Modelling for Engineering
Collaborating organizations: FONDAZIONE TELETHON ETS
URI: http://webthesis.biblio.polito.it/id/eprint/37193