
Real-time speech recognition using spiking neural networks

Bo Wang

Real-time speech recognition using spiking neural networks.

Supervisors: Stefano Di Carlo, Alessandro Savino, Alessio Carpegna. Politecnico di Torino, Master's degree programme in Electronic Engineering (Ingegneria Elettronica), 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Spiking neural networks (SNNs) have long been a popular research area in deep learning. They combine neuroscience and machine learning to build models that most closely simulate the computational mechanisms of biological neurons. However, SNN models are still in the exploratory research phase. This thesis explores the use of an SNN model in a real-world application: real-time speech recognition. An STM32 MEMS microphone provides the sound input, and MFCC (Mel-frequency cepstral coefficient) features are extracted from the audio data and then converted into a spike encoding for SNN training. The trained model is deployed on a PYNQ board for real-time speech recognition. In testing, the model achieved a recognition accuracy of up to 96.25%. The work relies primarily on a novel spike encoding method that significantly improves the recognition accuracy of the SNN. The research points to potential practical applications of SNNs in the future.
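
As a rough illustration of the pipeline described in the abstract (features, then spike encoding, then a spiking layer), the sketch below may help. It is not the thesis's method: the abstract does not describe the novel encoding, so plain Bernoulli rate coding is used as a stand-in. The feature matrix, the weights, and the function names `normalize`, `rate_encode`, and `lif_layer` are all hypothetical, and the leaky integrate-and-fire parameters are arbitrary.

```python
import random

def normalize(features):
    """Min-max scale a 2-D list (frames x coefficients) to [0, 1]."""
    flat = [v for row in features for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    return [[(v - lo) / span for v in row] for row in features]

def rate_encode(features, n_steps, seed=0):
    """Bernoulli rate coding (assumed stand-in, not the thesis's encoding).

    Each normalized value is used as the per-step spike probability.
    Returns spikes[t][frame][coeff] with entries 0 or 1.
    """
    rng = random.Random(seed)
    probs = normalize(features)
    return [[[1 if rng.random() < p else 0 for p in row] for row in probs]
            for _ in range(n_steps)]

def lif_layer(input_spikes, weights, beta=0.9, threshold=1.0):
    """Run a layer of leaky integrate-and-fire neurons over time.

    input_spikes: list over time steps of flat 0/1 input vectors.
    weights[j][i] connects input i to output neuron j.
    Returns the number of output spikes emitted by each neuron.
    """
    n_out = len(weights)
    membrane = [0.0] * n_out
    counts = [0] * n_out
    for x in input_spikes:
        for j in range(n_out):
            current = sum(w * s for w, s in zip(weights[j], x))
            membrane[j] = beta * membrane[j] + current  # leak, then integrate
            if membrane[j] >= threshold:
                counts[j] += 1
                membrane[j] = 0.0  # hard reset after a spike
    return counts

# Hypothetical 2 frames x 3 coefficients standing in for real MFCC output.
mfcc = [[-3.0, 0.5, 2.0], [1.0, -1.0, 4.0]]
spikes = rate_encode(mfcc, n_steps=50)

# Flatten each time step to one input vector of length 6.
flat = [[s for row in step for s in row] for step in spikes]

# Hypothetical, untrained weights for two output classes.
weights = [[0.3] * 6, [0.05] * 6]
counts = lif_layer(flat, weights)
predicted = counts.index(max(counts))  # class with the most output spikes
```

Reading the class from output spike counts is one common convention for SNN classifiers. A trained model would replace the fixed weights here, and a deployment like the one described would run the equivalent logic on the target hardware.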

Supervisors: Stefano Di Carlo, Alessandro Savino, Alessio Carpegna
Academic year: 2024/25
Publication type: Electronic
Number of pages: 70
Subjects:
Degree programme: Master's degree programme in Electronic Engineering (Ingegneria Elettronica)
Degree class: New system > Master's degree > LM-29 - INGEGNERIA ELETTRONICA
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/33322