Reinforcement Learning for Dynamic Job Shop Scheduling: A Maskable PPO Approach

Marco Pozzebon

Reinforcement Learning for Dynamic Job Shop Scheduling: A Maskable PPO Approach.

Supervisor: Giulia Bruno. Politecnico di Torino, Master's degree programme in Ingegneria Gestionale (Engineering and Management), 2025

PDF (Tesi_di_laurea) - Thesis
Access restricted to: staff users only until 27 November 2028 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

This thesis addresses the Dynamic Job Shop Scheduling Problem (DJSSP), a critical challenge in modern manufacturing characterized by unpredictable arrivals, strict deadlines, and routing flexibility. Traditional heuristics often lack adaptability in such dynamic environments. To overcome these limitations, a reinforcement learning framework based on Maskable Proximal Policy Optimization (PPO) is proposed. Key features include multi-discrete action spaces for parallel machine decisions, action masking for feasibility, and a multi-objective reward function balancing lateness reduction, on-time delivery, throughput, and operational stability. Domain randomization during training enhances generalization across varying conditions. A real case study validates the framework, comparing two PPO agents with classical heuristics (EDD, FIFO, LPT, SLACK) and a metaheuristic (Genetic Algorithm).
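The combination of a multi-discrete action space and action masking described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `masked_softmax` and `sample_joint_action`, the logits, the queue size, and the use of the last index as an "idle" action are all assumptions made for illustration. The idea shown is only that each machine gets its own discrete choice, and that infeasible choices receive zero probability before sampling.

```python
import math
import random


def masked_softmax(logits, mask):
    """Softmax over logits; entries whose mask is False get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be feasible")
    # Subtract the largest feasible logit for numerical stability.
    peak = max(l for l, ok in zip(logits, mask) if ok)
    weights = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_joint_action(logits_per_machine, mask_per_machine, rng):
    """Sample one action index per machine (a multi-discrete action)."""
    action = []
    for logits, mask in zip(logits_per_machine, mask_per_machine):
        probs = masked_softmax(logits, mask)
        action.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return action


# Hypothetical state: 2 machines, 3 queued jobs; index 3 means "stay idle".
logits = [[0.2, 1.5, -0.3, 0.0], [2.0, 0.1, 0.4, 0.0]]
# Assumed feasibility: machine 0 can run jobs 0 and 2, machine 1 only job 1.
mask = [[True, False, True, True], [False, True, False, True]]

action = sample_joint_action(logits, mask, random.Random(0))
print(action)  # one feasible index per machine
```

In this sketch each machine is sampled independently, so two machines could in principle pick the same job when both masks allow it. A full environment would need a rule for such conflicts; how the thesis handles them is not stated in the abstract.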

Results show that RL agents consistently outperform classical heuristics, achieving lower lateness, higher on-time completion, and improved continuity

Supervisors

Giulia Bruno

Academic year

2025/26

Publication type

Electronic

Number of pages

Degree programme

Master's degree programme in Ingegneria Gestionale (Engineering and Management)

Degree class

New system > Master's degree > LM-31 - INGEGNERIA GESTIONALE

URI

https://webthesis.biblio.polito.it/id/eprint/38206
