Lorenzo Martini
Study of cellular heterogeneity of mouse cerebral cortex, through joint scRNA-seq and scATAC-seq analysis, derived from SNARE-seq technique.
Supervisors: Stefano Di Carlo, Roberta Bardini. Politecnico di Torino, Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2020
PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
Single-cell RNA sequencing (scRNA-seq) is a Next Generation Sequencing (NGS) technique that allows the gene expression profiles of thousands of cells to be investigated simultaneously. With this kind of experiment, one can study cellular heterogeneity and search for new, rare cell types. However, despite the continuous growth of this technology, some problems remain, especially in the cross-validation of results. For this reason, this thesis aimed to explore the current state of the art and to find possible alternatives. This search led to a recent technique called SNARE-seq, which performs both scRNA-seq and single-cell ATAC sequencing (scATAC-seq) on the same sample of cells. ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) is an epigenetic analysis technique that assesses genome-wide chromatin accessibility, i.e. it allows the chromatin state and the accessibility of genes to be studied. The two kinds of information describe different cellular mechanisms but are complementary, so their joint analysis could improve the study of cellular heterogeneity. This work used the dataset provided with SNARE-seq, a collection of 10309 cells from samples of adult mouse cerebral cortex. The dataset was processed with two well-known pipelines, Monocle and Seurat. First, the gene expression matrix was analyzed on its own. Using first Seurat and then Monocle, the data were processed to obtain a classification based on unsupervised clustering algorithms. The result was a partition into 21 clusters, in agreement with the findings of the SNARE-seq researchers. Next, the same processing was applied to the accessibility data, using Cicero and Signac, the companion packages of Monocle and Seurat, respectively.
Besides an analysis similar to the one performed on the expression data, Cicero makes it possible to estimate co-accessibility scores and to find cis-co-accessibility networks (CCANs), which can be important for understanding regulatory mechanisms such as enhancer-promoter interactions. Signac, instead, provides tools for motif analysis and, in particular, ways to integrate scATAC-seq data with scRNA-seq data. Before proceeding to the joint analysis, it was necessary to establish a reference classification of the cells against which to compare the unsupervised cluster partitions, to make sure that the algorithms were recognizing cellular heterogeneity and not some other feature. To this end, each cell was classified independently by examining the expression of known marker genes. After the separate analyses, the study focused on the correlation between the results, trying to understand the relation between expression and accessibility of notable genes, such as cluster markers. The first approach was to overlap the classifications derived from the separate datasets to find the notable differences in the cluster partitions. This showed that, even though the overall classifications agreed, some clusters were subdivided. The second approach was to create a gene activity matrix from the accessibility data, in order to examine directly the overall accessibility of genes rather than only that of individual peaks. Finally, the relation between the expression of differentially expressed genes and their accessibility, and vice versa, was investigated with Signac's visualization tools. In conclusion, the joint analysis offers a wider view of the problem, which can improve the investigation. |
---|---|
Supervisors: | Stefano Di Carlo, Roberta Bardini |
Academic year: | 2020/21 |
Publication type: | Electronic |
Number of pages: | 87 |
Subjects: | |
Degree programme: | Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi) |
Degree class: | New system > Master's degree > LM-44 - Mathematical and Physical Modelling for Engineering |
Collaborating companies: | Not specified |
URI: | http://webthesis.biblio.polito.it/id/eprint/16750 |
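
The first comparison approach mentioned in the abstract, overlapping the cluster partitions obtained separately from expression and accessibility data, can be illustrated with a minimal Python sketch. This is not the thesis code, which relied on the R packages named above. The function names `overlap_table` and `split_clusters`, the `min_fraction` threshold, and the toy labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def overlap_table(rna_labels, atac_labels):
    """Count cells for each (expression cluster, accessibility cluster) pair.

    Both lists must describe the same cells in the same order.
    """
    if len(rna_labels) != len(atac_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(rna_labels, atac_labels))


def split_clusters(table, min_fraction=0.2):
    """Find expression clusters that are spread over several accessibility clusters.

    An accessibility cluster is counted only if it holds at least
    `min_fraction` of the cells of the expression cluster (hypothetical cutoff).
    """
    per_rna = defaultdict(dict)
    for (rna, atac), n in table.items():
        per_rna[rna][atac] = n

    result = {}
    for rna, counts in per_rna.items():
        total = sum(counts.values())
        major = sorted(a for a, n in counts.items() if n / total >= min_fraction)
        if len(major) > 1:
            result[rna] = major
    return result


# Toy labels for eight hypothetical cells (not data from the thesis).
rna = ["A", "A", "A", "A", "B", "B", "B", "B"]
atac = ["x", "x", "y", "y", "z", "z", "z", "z"]

table = overlap_table(rna, atac)
print(split_clusters(table))  # {'A': ['x', 'y']}
```

In this toy case, expression cluster `A` is split across accessibility clusters `x` and `y`, while `B` maps onto a single cluster. This is the kind of subdivision the abstract reports between otherwise agreeing classifications.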