Practical Evaluation of DDPG, TD3, and SAC for HVAC Control: A Comparative Study of Training Methods and Deployment Strategies

Filippo Bertolotti

Supervisors: Lorenzo Bottaccioli, Pietro Rando Mazzarino. Politecnico di Torino, Master's degree in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2025

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Heating, Ventilation, and Air Conditioning (HVAC) systems account for a significant proportion of the world's total energy consumption. In recent years, HVAC research has focused on developing new control systems based on artificial intelligence, particularly reinforcement learning. Reinforcement learning algorithms are well suited to the primary objective of HVAC control: reducing energy consumption while maintaining good thermal comfort. However, commercial application of this technology is still uncommon. This study investigates the most effective training methods for these algorithms, with a view to future real-world applications. Three reinforcement learning algorithms (DDPG, TD3, and SAC) were trained in a simulated residential environment using Energym, each in three different ways: purely online, purely offline, and offline followed by online fine-tuning.

The resulting agents were compared to identify the best-performing combinations of algorithm and training method. The comparison criteria included the convergence period, the loss of thermal comfort and the energy consumed during training, and the ability to outperform a widely used controller, such as PID, during the testing phase.
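
For readers unfamiliar with the PID baseline mentioned above, the following is a minimal sketch of a discrete PID controller driving a toy single-room thermal model. It is not the controller or the environment used in the thesis: the class name `PID`, the helper `simulate_room`, the gains, and the thermal coefficients are all illustrative assumptions, and Energym is not involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PID:
    """Discrete PID controller with output clipping (no anti-windup)."""
    kp: float
    ki: float
    kd: float
    setpoint: float          # target room temperature, in deg C
    out_min: float = 0.0     # heater fully off
    out_max: float = 1.0     # heater at full power
    _integral: float = 0.0
    _prev_error: Optional[float] = None

    def step(self, measurement: float, dt: float) -> float:
        """Return the heater command in [out_min, out_max] for one time step."""
        error = self.setpoint - measurement
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        u = self.kp * error + self.ki * self._integral + self.kd * derivative
        return min(self.out_max, max(self.out_min, u))


def simulate_room(controller: PID, steps: int, t_init: float = 16.0,
                  t_out: float = 5.0, dt: float = 1.0) -> List[float]:
    """Toy first-order room model (hypothetical coefficients, dt in minutes)."""
    k_heat = 0.5    # deg C per minute gained at full heater power (assumed)
    k_loss = 0.01   # fraction of indoor/outdoor gap lost per minute (assumed)
    temp = t_init
    history = [temp]
    for _ in range(steps):
        u = controller.step(temp, dt)
        temp += dt * (k_heat * u - k_loss * (temp - t_out))
        history.append(temp)
    return history


if __name__ == "__main__":
    pid = PID(kp=0.5, ki=0.01, kd=0.0, setpoint=21.0)
    temps = simulate_room(pid, steps=500)
    print(f"final temperature: {temps[-1]:.2f} deg C")
```

A learned agent would replace `controller.step` with a policy that maps the observed state to a heater command. Under the criteria listed above, it would be judged on whether it holds comfort at least as well as such a baseline while using less energy.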