Alessandro Lovaldi
Predicting Deep Reinforcement Learning agents learning time for video game playing: a data-driven approach.
Supervisors: Paolo Giaccone, Andrea Bianco. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2021
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
In the last few decades, machine learning has made massive progress, which has made it useful in a wide range of studies. One flourishing research field applies machine learning to gaming. Countless reinforcement learning models have been created for a wide range of game genres, and many studies and applications make use of AI agents trained with one of those models. As an example, the work that inspired this thesis proposes using an AI agent to assess the Quality of Experience in cloud gaming services. Training an agent from scratch has some drawbacks. One of the problems is the high variance in the duration of the training phase. This variance is due to many factors; the main ones are the reinforcement learning model selected, the hardware used to run the training, and the complexity of the game. The goal of this thesis is to identify which characteristics make a game complex for an AI agent to learn and how this complexity affects the learning time. More precisely, we have studied whether it is possible to predict how long it takes a selected model to learn a game. This prediction is based solely on the game features. For our research, the games selected are Atari games and the model selected is Double DQN (DDQN). DDQN is a deep reinforcement learning algorithm able to play Atari games at a superhuman level. We achieved our goal by modifying an existing DDQN model to gather data from tens of Atari games during the training phase. The data collected describe two main aspects of the game: the shape of the reward signals and the visual component. The shape of the rewards is a key aspect of reinforcement learning: reward frequency and magnitude can heavily influence the model performance. The visual component is considered because the DDQN uses the frame's pixels as input.
We then used unsupervised machine learning techniques, such as regression analysis, to investigate the correlation between the game characteristics and the training duration. |
---|---|
Supervisors: | Paolo Giaccone, Andrea Bianco |
Academic year: | 2020/21 |
Publication type: | Electronic |
Number of pages: | 48 |
Subjects: | |
Degree programme: | Master's degree programme in Computer Engineering (Ingegneria Informatica) |
Degree class: | New regulations (Nuovo ordinamento) > Master's degree > LM-32 - Computer Engineering (Ingegneria Informatica) |
Collaborating companies: | Not specified |
URI: | http://webthesis.biblio.polito.it/id/eprint/18075 |