Design and implementation of a real time data lake in cloud
Vincenzo Siciliani
Design and implementation of a real time data lake in cloud.
Supervisor: Tania Cerquitelli. Politecnico di Torino, Master's degree programme in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2021
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This thesis is based on my work experience in a project carried out by NTT Data Italia for one of its major clients in the media sector: the design and implementation of a real-time Big Data platform in a cloud environment. The goal of the project is to guide the client in evolving its data management technologies, migrating from an architecture based on several data warehouses to a single data lake that centralizes all data. The new platform allows the client's business users to perform their analyses more easily, quickly and accurately, and enables data scientists to develop their prediction models by joining data from different data sources and departments.
To achieve these results, we analyzed the AS-IS architecture of the client's databases, the final requirements, and how to implement them on a cloud platform, namely Google Cloud Platform, of which the client is a partner. The implementation uses the features of the tools made available by the cloud provider to address availability, scalability, security and cost optimization.
