Paolo Iannino
NoSQL Data Lake: Search Engine, Analytics and Machine Learning.
Supervisor: Paolo Garza. Politecnico di Torino, Master of Science program in Computer Engineering, 2018
Abstract
The project aims to build a data lake that provides a comprehensive set of analytical tools for an R&D team at Amadeus, the leading IT provider for the travel industry. The system targets different data sources related to the reissue of a flight ticket, which are processed to achieve three main objectives: a search engine, a statistical framework and a business intelligence tool. These goals are mapped to three different tasks: the development of an efficient preprocessing phase, the proper organization of the storage and the graphical user interface, and the elaboration of a machine learning solution. One of the main contributions is the use of current state-of-the-art technologies for scalable data processing and data storage.
Indeed, the entire preprocessing phase is performed through the Spark framework, while the chosen database, which is the functional core of the system, employs a NoSQL approach.
