Babelle Tchoumi Yomi
Development and Orchestration of a Scalable and Efficient Automated Data Ingestion Workflows and Pipelines for Multi-Domain at MSC Technology Italia.
Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This thesis was conducted at MSC Technology Italia as part of an enterprise-wide initiative to modernize and automate data ingestion processes across key business domains, including Finance, CRM, Logistics, Operations, and Liners. In the current hybrid architecture, Informatica PowerCenter is used to orchestrate ingestion workflows from various structured data sources. Azure Synapse Analytics supports the design and execution of cloud-native data pipelines, particularly for systems based on Oracle. This division reflects MSC’s progressive shift from traditional ETL to scalable, cloud-based processing. The project also aligns with MSC’s strategic objective to migrate toward Microsoft Fabric, a unified analytics platform built on a lakehouse model.
As part of this transition, Fabric pipelines were developed using notebooks, PySpark, Script activities, and Copy activities to demonstrate modern ingestion and transformation capabilities.
