
Power and Area Optimization in Neural Receivers

Roberta Fiandaca

Power and Area Optimization in Neural Receivers.

Rel. Maurizio Martina, Guido Masera. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Elettronica (Electronic Engineering), 2023


The massive throughput increase in 6G wireless communication systems, enabled by a wider spectrum and a large number of antenna elements, has driven extensive use of AI-based technologies to achieve high system performance. Prior art shows the outstanding performance of neural receivers compared to conventional ones, but this comes with high network complexity and a correspondingly heavy computational cost. This poses a significant challenge for deploying such receivers on hardware-constrained devices, making optimization strategies that reduce the computational cost a primary concern. In this work, we focus on the optimization of a neural receiver through two main strategies: quantization and compression. The former reduces computation precision with the goal of saving memory and computing hardware. We introduce both uniform quantization, characterized by a constant step size, and non-uniform quantization, with variable step sizes. Among the latter, a relevant place is occupied by Fibonacci Codeword Quantization (FCQ), which rounds a number to its closest Fibonacci codeword, enabling the use of a simplified or-based multiplier. An Incremental Network Quantization (INQ) strategy, which alternates quantization and retraining steps, is used to recover part of the accuracy lost to quantization. We propose a fine-grained INQ approach that, together with a careful combination of FCQ and uniform quantization, maintains good performance levels. Two novel lossless compression techniques are proposed to reduce the large amount of data in the network, which would otherwise require a huge memory space. The combination of the two algorithms effectively compresses sequences of parameters that exhibit high redundancy. The simplified or-based multiplier introduced through FCQ shows a 44% reduction in area and a 45% reduction in power consumption compared to a standard multiplier.
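As a minimal sketch of the rounding step FCQ describes, assuming a Fibonacci codeword here means an n-bit binary value with no two adjacent 1s (the Zeckendorf property that enables simplified multipliers); the function names and the non-negative-integer domain are illustrative choices, not taken from the thesis:

```python
def fibonacci_codewords(bits):
    """Enumerate all `bits`-bit values whose binary form has no two adjacent 1s."""
    return [v for v in range(1 << bits) if (v & (v >> 1)) == 0]

def fcq_quantize(value, bits):
    """Round a non-negative integer to its nearest Fibonacci codeword."""
    return min(fibonacci_codewords(bits), key=lambda c: abs(c - value))
```

For 4 bits the valid codewords are 0, 1, 2, 4, 5, 8, 9, 10, so a value such as 7 rounds to 8; a hardware implementation would apply the same rounding to fixed-point weights.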
Moreover, the combination of quantization and compression yields significant memory savings: relative to networks simply quantized to 8 and 16 bits, the optimized version saves 63% and 26% of the memory space, respectively. The results of this work provide valuable insight for the development of efficient AI-based technologies that can be deployed on hardware-constrained devices.
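The abstract does not detail the two compression algorithms; as a generic illustration of how lossless compression exploits redundancy in parameter sequences (e.g. long runs of identical quantized weights), a run-length encoding sketch, with names chosen here rather than taken from the thesis:

```python
def rle_encode(seq):
    """Losslessly encode a sequence as (value, run_length) pairs."""
    runs = []
    for v in seq:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Invert rle_encode, recovering the original sequence exactly."""
    return [v for v, n in runs for _ in range(n)]
```

A highly redundant sequence such as [0, 0, 0, 5, 5, 1] compresses to three pairs, and decoding reproduces the input bit-exactly, which is the defining property of a lossless scheme.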

Supervisors: Maurizio Martina, Guido Masera
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 76
Additional Information: Embargoed thesis. Full text not available.
Degree programme: Corso di laurea magistrale in Ingegneria Elettronica (Electronic Engineering)
Degree class: New organization > Master science > LM-29 - ELECTRONIC ENGINEERING
Collaborating companies: Nokia Bell N.V.
URI: http://webthesis.biblio.polito.it/id/eprint/26682