Lucca Gamballi
Reinforcement Learning for Dynamic Scheduling.
Rel. Edoardo Fadda, Leonardo Kanashiro Felizardo. Politecnico di Torino, Master of Science program in ICT for Smart Societies, 2025
PDF (Tesi_di_laurea)
- Thesis
Licence: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This thesis investigates Reinforcement Learning (RL) for the Dynamic Job Shop Scheduling Problem (DJSSP), where agents make sequencing decisions under random job arrivals and tardiness is realized only upon job completion. The work argues that asynchronous per-machine decisions mitigate the credit-assignment challenge and improve training stability, motivating designs that explicitly align rewards with the causality of shop-floor events. The scheduler adopts a Centralized-Training, Decentralized-Execution (CTDE) scheme with parameter sharing and an event-driven policy that acts only at irregular decision epochs; this preserves local detail while remaining size-agnostic as queues fluctuate. The state is constructed with a "Minimal Repetition" encoder that packs each machine's top job candidates into fixed slots of job-specific features, enabling direct job selection without fixing the problem size.
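The fixed-slot idea behind the "Minimal Repetition" encoder can be sketched as follows. This is an illustrative assumption, not the thesis implementation: the slot count, the three features, and the slack-based candidate ordering are all hypothetical choices used only to show how a variable-length queue maps to a fixed-size observation.

```python
import numpy as np

K_SLOTS = 4      # candidate slots per machine (assumed)
N_FEATURES = 3   # e.g. processing time, due-date slack, waiting time (assumed)

def encode_machine_state(queue):
    """Pack a variable-length job queue into a fixed (K_SLOTS, N_FEATURES) array.

    queue: list of dicts with hypothetical keys 'proc_time', 'slack', 'wait'.
    Empty slots stay zero-padded; the mask marks which slots hold real jobs,
    so the policy's input size is independent of queue length.
    """
    obs = np.zeros((K_SLOTS, N_FEATURES), dtype=np.float32)
    mask = np.zeros(K_SLOTS, dtype=bool)
    # Keep the top-K candidates, here ordered by tightest due-date slack.
    top = sorted(queue, key=lambda j: j["slack"])[:K_SLOTS]
    for i, job in enumerate(top):
        obs[i] = (job["proc_time"], job["slack"], job["wait"])
        mask[i] = True
    return obs, mask

queue = [
    {"proc_time": 5.0, "slack": 2.0, "wait": 1.0},
    {"proc_time": 3.0, "slack": -1.0, "wait": 4.0},
]
obs, mask = encode_machine_state(queue)
print(obs.shape)      # (4, 3) regardless of how many jobs are queued
print(mask.tolist())  # [True, True, False, False]
```

Because every machine emits the same fixed shape, one shared policy network can score the slots directly and "select a job" by picking a masked slot index.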
The delayed reward is handled via a chronological joint-action pipeline: transitions are buffered without a reward and completed only when a job finishes, at which point a joint signal is allocated to the responsible agents in proportion to the queueing delay they induced.
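A minimal sketch of such a buffer, under stated assumptions: the class name, the per-transition `induced_wait` bookkeeping, and the proportional split rule are illustrative stand-ins for whatever credit-allocation rule the thesis actually uses.

```python
from collections import defaultdict

class DelayedRewardBuffer:
    """Hold rewardless transitions until their job completes, then assign credit."""

    def __init__(self):
        # job_id -> list of (agent_id, transition, induced_wait)
        self.pending = defaultdict(list)

    def store(self, job_id, agent_id, transition, induced_wait):
        """Buffer a transition taken on behalf of `job_id`, with no reward yet."""
        self.pending[job_id].append((agent_id, transition, induced_wait))

    def complete(self, job_id, joint_reward):
        """Job finished: split `joint_reward` in proportion to induced waits."""
        entries = self.pending.pop(job_id, [])
        total_wait = sum(w for _, _, w in entries) or 1.0  # guard division by zero
        return [
            (agent_id, transition, joint_reward * wait / total_wait)
            for agent_id, transition, wait in entries
        ]

buf = DelayedRewardBuffer()
buf.store(job_id=7, agent_id="m1", transition=("s0", "a0", "s1"), induced_wait=3.0)
buf.store(job_id=7, agent_id="m2", transition=("s1", "a1", "s2"), induced_wait=1.0)
done = buf.complete(job_id=7, joint_reward=-4.0)
print([(a, r) for a, _, r in done])  # [('m1', -3.0), ('m2', -1.0)]
```

The completed transitions, now carrying their share of the tardiness signal, can be pushed to the shared replay buffer in chronological order, which is what makes the delayed, completion-time reward compatible with standard off-policy training.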