Michele Gallina
A configurable data platform for streaming delta and full data ingestion.
Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2024
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: | The thesis project focuses on creating a cloud-based platform that manages large amounts of data securely, efficiently, and dynamically to meet current and future needs. The work was carried out using Apache Spark within Databricks, analyzing which framework best suited each requirement. The platform is entirely cloud-based and built on Microsoft Azure. Building on a cloud service allows easy scaling, both up and down, to respond quickly to changes in data volume or to adjust the required processing time. Databricks provides a highly versatile platform based on Apache Spark and natively integrated with many frameworks, which makes it possible to build a system that covers needs ranging from data ingestion and processing to the creation of complex dashboards and even the use of AI models. In particular, the thesis focused on creating a data platform for a security company to ingest data from two sources: a relational database and a network of IoT sensors. Once the data are stored on the platform, they undergo a quality improvement process before being made available to meet business needs. The platform was also designed to be as configurable as possible so that it can be extended easily. On the business side, three company requirements were selected, and a solution was proposed for each. |
---|---|
Supervisors: | Paolo Garza |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 97 |
Subjects: | |
Degree programme: | Master's degree programme in Data Science And Engineering |
Degree class: | New organization > Master's degree > LM-32 - COMPUTER ENGINEERING |
Collaborating companies: | Cluster Reply Srl |
URI: | http://webthesis.biblio.polito.it/id/eprint/34023 |