Design of a Coverage-driven Reinforcement Learning Framework for RISC-V Functional Verification

Marco Rosa Gobbo

Design of a Coverage-driven Reinforcement Learning Framework for RISC-V Functional Verification.

Supervisors: Mariagrazia Graziano, Andrea Marchesin, Michele Caon, Maurizio Martina. Politecnico di Torino, Master's degree programme in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2024

License: Creative Commons Attribution Non-commercial Share Alike.

Abstract:	Verification in digital design is the process of testing and validating the behavior of a system before it is released or deployed. It is a fundamental part of the design process and often takes more than half of the development time because complete coverage is hard to reach. Traditional verification techniques, such as directed testing and constrained random testing, often fail to capture critical edge cases in complex systems. To address this gap, this thesis explores the application of Reinforcement Learning (RL) to the functional verification of RISC-V cores, which are becoming increasingly popular, specifically through the automatic generation of assembly code to enhance test coverage. The investigation begins by building a test-bench for RISC-V cores, intended to be as implementation-independent as possible, using the Universal Verification Methodology (UVM) in SystemVerilog (SV) with the Spike instruction set simulator as the golden model. The test-bench is then translated into a Python-based environment using the PyUVM library, with Verilator as the simulator, to enable an open-source setup. This eases integration with the other components needed in the flow, such as the custom instruction generator and the coverage collection, providing a flexible framework for closed-loop instruction generation and core state observation. At this point we introduce the RL agent, which guides the instruction generator based on coverage metrics and CPU state (e.g., register file and program counter). Because the action space is very large and has not been tackled by previous research works, the first implementation is a custom-built RL agent that relies on Gymnasium for a standard API towards the environment. It is a deep Q-learning agent that uses Neural Networks (NNs) as function approximators, divided into a state encoder and specialized child NNs to avoid an explosion of the action space size.
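The effect of splitting the agent into specialized child networks can be illustrated with a small size comparison. This is only a sketch: the field names and sizes below are hypothetical and are not taken from the thesis, whose actual action encoding is not described in the abstract. It compares the number of outputs a single flat Q-head would need with the number needed when each instruction field gets its own small head.

```python
from math import prod

# Hypothetical field sizes for one generated instruction (assumptions,
# not the action encoding used in the thesis).
FIELDS = {
    "opcode": 40,    # assumed number of selectable instruction types
    "rd": 32,        # 32 integer registers in the base RISC-V ISA
    "rs1": 32,
    "rs2": 32,
    "imm": 2 ** 12,  # 12-bit immediate, as in I-type instructions
}


def joint_outputs(fields):
    """Outputs needed by one flat Q-head enumerating every field combination."""
    return prod(fields.values())


def factored_outputs(fields):
    """Outputs needed when each field is chosen by its own child head."""
    return sum(fields.values())


if __name__ == "__main__":
    print("flat head:     ", joint_outputs(FIELDS))
    print("factored heads:", factored_outputs(FIELDS))
```

Under these assumed sizes the flat head would need billions of outputs, while the factored heads need a few thousand, which is the kind of reduction the split into child networks aims for.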
The second approach uses the Stable Baselines3 (SB3) library, which provides established RL algorithms and policies, including Proximal Policy Optimization (PPO) and the Multi Input Policy. In both cases, different state vectors and reward functions are tested. Finally, we compare the post-training results obtained by the RL agent with the average coverage obtained by requesting random instructions from the instruction generator. The first, custom agent shows no improvement because its NNs do not converge: a naive implementation of the networks leads to exploding weights and loss values that do not decrease. The second, SB3-based approach shows encouraging results. For instance, with 100 requests to the instruction generator (approximately 200 assembly instructions), an average coverage increase of 4.2% over random generation is observed. The RL agent generates diverse instruction sequences that stress different areas of the processor and contain data dependencies, thanks to a reward function that promotes these behaviors. This work provides a solid foundation for future research and has already explored several implementation options, identifying the more efficient ones. Being fully open-source, the framework offers a performant and versatile basis, compared to proprietary solutions, for further exploring other machine learning approaches.
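The closed loop and the random baseline described in the abstract can be sketched with a toy environment. Everything here is an assumption made for illustration: the class `ToyCoverageEnv`, its coverage bins (ordered pairs of consecutive instruction kinds), and the reward (1 for each newly hit bin) are hypothetical and do not reproduce the thesis's test-bench, state vector, or reward function. The `reset`/`step` return values only mimic the shape of the Gymnasium API, using the standard library alone; the dictionary observation is the kind of input that SB3's `MultiInputPolicy` is designed for.

```python
import random


class ToyCoverageEnv:
    """Toy closed loop: each action requests one instruction kind and the
    reward counts coverage bins hit for the first time (hypothetical model)."""

    def __init__(self, n_kinds=8, max_requests=100):
        self.n_kinds = n_kinds
        self.max_requests = max_requests
        self.reset()

    def reset(self):
        self.covered = set()  # bins are (previous kind, current kind) pairs
        self.prev = None
        self.requests = 0
        return self._obs(), {}

    def coverage(self):
        return len(self.covered) / self.n_kinds ** 2

    def _obs(self):
        # Dict observation: last requested kind plus current coverage.
        return {"prev": self.prev, "coverage": self.coverage()}

    def step(self, action):
        reward = 0.0
        if self.prev is not None and (self.prev, action) not in self.covered:
            self.covered.add((self.prev, action))
            reward = 1.0
        self.prev = action
        self.requests += 1
        terminated = len(self.covered) == self.n_kinds ** 2
        truncated = self.requests >= self.max_requests
        return self._obs(), reward, terminated, truncated, {}


def random_baseline(env, episodes=20, seed=0):
    """Average final coverage when instruction kinds are requested at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        env.reset()
        done = False
        while not done:
            _, _, terminated, truncated, _ = env.step(rng.randrange(env.n_kinds))
            done = terminated or truncated
        total += env.coverage()
    return total / episodes


if __name__ == "__main__":
    print(f"random baseline coverage: {random_baseline(ToyCoverageEnv()):.3f}")
```

A trained agent would replace the random choice inside `random_baseline`, and its average coverage over the same number of requests would be compared against this baseline figure.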
Supervisors:	Mariagrazia Graziano, Andrea Marchesin, Michele Caon, Maurizio Martina
Academic year:	2024/25
Publication type:	Electronic
Number of pages:	80
Subjects:
Degree programme:	Master's degree programme in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica)
Degree class:	New system > Master's degree > LM-32 - Computer Engineering
Collaborating companies:	Not specified
URI:	http://webthesis.biblio.polito.it/id/eprint/33024
