Hafez Ghaemi
Decentralized Value-Based Reinforcement Learning in Stochastic Potential Games.
Supervisors: Fabio Fagnani, Giacomo Como. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2022
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Multi-agent reinforcement learning (MARL) is a promising paradigm for learning problems involving multiple decision makers. In contrast to centralized MARL with a central controller, decentralized (independent) MARL is more practical in terms of scalability, privacy, and computational cost, yet more challenging due to the non-stationarity of the environment from each agent's perspective. Non-stationarity arises because the evolution of the environment and an agent's payoffs depend on the behavior of the other agents. In value-based MARL, two-timescale learning has been shown to address this issue: agents update their value-function estimates on a timescale slower than their local Q-function estimates, so the game is rendered locally stationary with respect to the strategies of the other agents.
However, two-timescale dynamics in decentralized Q-learning have so far been studied only in two-player zero-sum games.
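The two-timescale idea described above can be illustrated with a minimal sketch of a single agent's update loop. This is not the thesis's actual dynamics: the step-size schedules, the toy environment (random rewards and transitions), and all variable names are assumptions chosen only to show the separation of timescales, where the Q-factor update is fast and the value-estimate update is slow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy single-agent view: a local Q-table and a state-value estimate v,
# updated on two timescales. Step-size schedules are hypothetical:
# beta_t / alpha_t -> 0, so v moves slower than Q.
n_states, n_actions, gamma = 4, 2, 0.9
Q = np.zeros((n_states, n_actions))
v = np.zeros(n_states)

def step_sizes(t):
    alpha = 1.0 / (t + 1) ** 0.6   # fast timescale (Q-factor update)
    beta = 1.0 / (t + 1)           # slow timescale (value update)
    return alpha, beta

s = 0
for t in range(5000):
    a = rng.integers(n_actions)        # exploratory action
    r = rng.random()                   # stand-in for the stage payoff
    s_next = rng.integers(n_states)    # stand-in for the state transition
    alpha, beta = step_sizes(t)
    # Fast update: the local Q-factor bootstraps on the quasi-static v,
    # which looks stationary from the fast timescale's perspective.
    Q[s, a] += alpha * (r + gamma * v[s_next] - Q[s, a])
    # Slow update: v drifts toward the greedy value of the local Q-factors.
    v[s] += beta * (Q[s].max() - v[s])
    s = s_next

print(Q.shape, v.shape)
```

In a decentralized game, each agent would run such a loop independently; because every agent's v changes slowly, the environment each agent faces is approximately stationary between value updates, which is the stabilizing mechanism the abstract refers to.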