A Data Driven Approach to Remaining Time Prediction of Process Instances

Marco Di Nepi

A Data Driven Approach to Remaining Time Prediction of Process Instances.

Supervisor: Silvia Anna Chiusano. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Large companies usually keep track of internal processes by continuously updating a database with event logs. These logs are crucial for carrying out conformance checks and for monitoring whether a case is progressing as expected and in line with past behavior or, alternatively, for detecting errors or unexpected loops that can degrade system performance. Predictive process monitoring comprises a set of techniques and methodologies for analyzing event logs with the purpose of making predictions about running cases. Being able to predict in real time the remaining time until a case completes is crucial to allow the user to intervene promptly. A fast response reduces the risk of delays and slowdowns in the entire workflow, which may occur at any moment, and increases awareness of behaviors that deviate from the normal trend. The problem can be treated as a supervised learning task, and in this work we propose a methodology based on neural network models. In particular, since a log is structured as an ordered sequence of events, it is natural to exploit architectures able to handle sequential data and very long dependencies, such as recurrent and attention-based architectures. The goal is to integrate and optimize, through machine learning algorithms, the application maintenance service provided by the company. The case study, forecasting job completion time in an HPC system, covered the entire process from data acquisition to model deployment, as well as the development of a dedicated web application that provides, in addition to the prediction, other useful features for improving the system.
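
As a minimal illustration of how remaining-time prediction can be framed as a supervised learning task, the sketch below turns a toy event log into (prefix, remaining time) training pairs. The log contents, the function name `build_training_pairs`, and the choice of using every proper prefix of a case are assumptions made for illustration; the abstract does not specify how the thesis constructs its training data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy event log: (case_id, activity, ISO timestamp).
EVENT_LOG = [
    ("job-1", "submit", "2021-01-01T10:00:00"),
    ("job-1", "start", "2021-01-01T10:05:00"),
    ("job-1", "finish", "2021-01-01T10:30:00"),
    ("job-2", "submit", "2021-01-01T11:00:00"),
    ("job-2", "finish", "2021-01-01T11:10:00"),
]


def build_training_pairs(event_log):
    """Turn each completed case into (activity prefix, remaining seconds) pairs."""
    # Group events by case.
    cases = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        cases[case_id].append((datetime.fromisoformat(timestamp), activity))

    pairs = []
    for events in cases.values():
        events.sort(key=lambda event: event[0])  # chronological order within the case
        end_time = events[-1][0]  # completion time of the case
        # Assumption: one sample per proper prefix (the full trace is skipped,
        # since its remaining time is trivially zero).
        for k in range(1, len(events)):
            prefix = [activity for _, activity in events[:k]]
            remaining = (end_time - events[k - 1][0]).total_seconds()
            pairs.append((prefix, remaining))
    return pairs


if __name__ == "__main__":
    for prefix, remaining in build_training_pairs(EVENT_LOG):
        print(prefix, remaining)
```

Each pair could then be encoded (for example, by mapping activities to integer indices) and fed to a sequence model such as a recurrent or attention-based network, with the remaining time as the regression target.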

Supervisors: Silvia Anna Chiusano
Academic year: 2020/21
Publication type: Electronic
Number of Pages: 78
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: Technology Reply Srl
URI: http://webthesis.biblio.polito.it/id/eprint/19178