Virginia Marcante
Scaling Approximate Linear Programming: Langevin Based High-Dimensional Sampling for Constraint Violation Learning.
Advisor: Sandra Pieraccini. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2023
PDF (Tesi_di_laurea), Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Approximate Linear Programs (ALPs) are fundamental models for computing value function approximations and bounds for high-dimensional Markov decision processes (MDPs) arising in a wide range of applications. An ALP has a manageable number of variables but a large number of constraints. Constraint generation and constraint sampling are the two traditional approaches to handling these numerous constraints. The former has limited applicability; the latter, while broadly applicable, does not guarantee a valid bound on the optimal policy value. Constraint violation learning is a recent approach for solving ALPs that combines first-order methods with Metropolis-Hastings sampling, providing a general-purpose method that retains the bounding property of ALP and has convergence guarantees. Its reliance on Metropolis-Hastings sampling, however, limits its scalability.
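The title points to Langevin-based sampling as the scalable alternative. As background, a minimal sketch of the unadjusted Langevin algorithm (ULA) is shown below: it drives a chain with the gradient of the log-density plus Gaussian noise, avoiding the accept/reject step of Metropolis-Hastings. This is an illustrative sketch of the generic technique, not the thesis's implementation; the function names and step-size choices are assumptions.

```python
import numpy as np

def ula_sample(grad_log_density, x0, step=0.05, n_steps=20000, seed=0):
    """Unadjusted Langevin algorithm (ULA).

    Draws approximate samples from p(x) proportional to exp(log p(x)),
    using only its gradient, via the iteration
        x_{k+1} = x_k + step * grad_log_p(x_k) + sqrt(2 * step) * N(0, I).
    Unlike Metropolis-Hastings, there is no accept/reject step, so each
    iteration is a single gradient evaluation plus a noise draw.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        noise = rng.standard_normal(x.size)
        x = x + step * grad_log_density(x) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Illustrative target: a standard Gaussian, whose grad log-density is -x.
samples = ula_sample(lambda x: -x, x0=np.zeros(2))
burned = samples[5000:]  # discard burn-in before computing statistics
```

With a fixed step size, ULA samples from a slightly biased stationary distribution; Metropolis-adjusted Langevin (MALA) removes this bias at the cost of an accept/reject step.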