Dynamical equilibrium selection by reinforcement learning algorithms: a study of two time-scale regimes in the El Farol Bar problem

Giuseppe Citino

Dynamical equilibrium selection by reinforcement learning algorithms: a study of two time-scale regimes in the El Farol Bar problem.

Supervisor: Luca Dall'Asta. Politecnico di Torino, not specified, 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Reinforcement learning (RL) algorithms are investigated in the context of the El Farol Bar problem. Adapting an RL algorithm designed for potential stochastic games to a non-stochastic scenario reveals dynamics governed by two distinct time scales. Numerical analysis demonstrates that the relative speeds of these time scales critically influence convergence behavior. Specifically, when the Q-function evolves on the fast time scale, the dynamics favor convergence to the (unique) symmetric mixed-strategy Nash equilibrium. Conversely, when the fast time scale is the one governing the learning of strategies, the dynamics converge to one of the pure-strategy Nash equilibria. This result highlights the interplay between the learning dynamics of RL algorithms and the stability of equilibria, offering insights into their convergence properties.
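
The two time-scale structure described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm studied in the thesis: the payoffs (+1 for going to an uncrowded bar, -1 for a crowded one, 0 for staying home), the softmax policy target, and every name and parameter (`simulate`, `alpha`, `beta`, `tau`, `capacity`) are assumptions introduced here for illustration only.

```python
import numpy as np

def simulate(n_agents=10, capacity=6, alpha=0.1, beta=0.001,
             tau=0.1, steps=2000, seed=0):
    """Toy two time-scale learner for an El Farol-style game.

    alpha: step size of the Q-value update (assumed).
    beta:  step size of the policy update (assumed).
    """
    rng = np.random.default_rng(seed)
    q = np.zeros((n_agents, 2))      # q[i, a]: payoff estimate, a=0 stay, a=1 go
    p_go = np.full(n_agents, 0.5)    # probability that agent i goes to the bar
    idx = np.arange(n_agents)
    for _ in range(steps):
        go = rng.random(n_agents) < p_go
        attendance = int(go.sum())
        # Assumed payoffs: +1 if the bar is not crowded, -1 if it is, 0 at home.
        bar_payoff = 1.0 if attendance <= capacity else -1.0
        reward = np.where(go, bar_payoff, 0.0)
        actions = go.astype(int)
        # Q update for the action each agent actually played (rate alpha).
        q[idx, actions] += alpha * (reward - q[idx, actions])
        # Policy update: relax towards a softmax of the Q-values (rate beta).
        target = 1.0 / (1.0 + np.exp(-(q[:, 1] - q[:, 0]) / tau))
        p_go += beta * (target - p_go)
    return q, p_go

# alpha >> beta: Q-values are the fast variable; alpha << beta: policies are.
q_fast, p_slow = simulate(alpha=0.1, beta=0.001)
q_slow, p_fast = simulate(alpha=0.001, beta=0.1)
```

Swapping the magnitudes of `alpha` and `beta` switches between the two regimes the abstract contrasts. Which equilibrium a given run approaches is the thesis's numerical finding; this sketch does not guarantee it.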

Supervisors: Luca Dall'Asta
Academic year: 2023/24
Publication type: Electronic
Number of pages: 76
Subjects:
Degree programme: Not specified
Degree class: New system > Master's degree > LM-44 - MODELLISTICA MATEMATICO-FISICA PER L'INGEGNERIA (Mathematical and Physical Modelling for Engineering)
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/30870