
Quantization Analysis for Face Image Detection Through Dense Neural Networks

Alessandra Calzoni

Quantization Analysis for Face Image Detection Through Dense Neural Networks.

Supervisors: Guido Masera, Giovanni Ramponi. Politecnico di Torino, Master's degree programme in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica), 2020

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Facial recognition is a technology capable of verifying a person's identity by analysing his or her face. In recent years this biometric measurement has become increasingly popular: it is the most "natural" way to identify a person, and it requires no physical interaction with the person being identified. Machine learning has also greatly helped the development and diffusion of face recognition, making it easier to develop and implement. The technology can be used in a wide range of applications: in security services, to prevent and detect crimes; at boarding checks, to verify passengers' identity; and on smartphones or computers for ID verification, including critical applications such as bank accounts and mobile payments.

The growing use in mobile systems has demanded more research into the trade-off between power consumption, which must be limited, and accuracy, which cannot be lowered too much, especially for critical ID verification. Over time, the accuracy of neural networks has increased together with their complexity and size. The low-power requirement has drawn attention to the need for systems that are less computationally and memory intensive. Many methods have been developed to reduce the number of parameters and the amount of computation; knowledge distillation and quantization are two examples.

This thesis analyses different quantization approaches applied to two versions of a DenseNet network, both derived from a more complex architecture through knowledge distillation. Different data formats have been investigated, each with its own pros and cons. Reduced bit-width versions of the floating-point representation have been considered to reduce power and memory requirements without degrading accuracy. To further reduce the computational power, the accuracy of the network has been analysed with dynamic fixed-point representations and with parameters represented as powers of 2. This last method leads to a "multiplier-free" architecture, because multiplying by a power of 2 reduces to a bit shift.

In addition, the large number of batch normalization layers in the two architectures demands close attention to their quantization. Batch normalization involves highly power-consuming operations, such as multiplication and division, and it can introduce large variations in the overall accuracy of the network. Two main methods have been considered to manage these problems. Folding a batch normalization layer into the preceding convolutional or fully connected layer eliminates the memory required by the batch normalization parameters, by manipulating the expressions that produce the output features during inference. The approach can be slightly modified to also reduce the parameters required by batch normalization layers that are far from convolutional or fully connected layers. Neither technique introduces any degradation in accuracy.

Another critical aspect of batch normalization layers is the accuracy degradation after quantization. Uniform quantization cannot limit this problem, so other approaches have to be considered. A possible solution is k-means clustering, which divides the parameters of the network into clusters, each represented by a centroid that is evaluated iteratively to reduce the squared error of the parameters in its cluster. Future work could build on this study and its solutions for the development of mobile hardware accelerators.
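
The batch normalization folding mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: it assumes a fully connected layer followed directly by a batch normalization layer with fixed inference statistics, and the names `fold_batchnorm`, `dense`, `batchnorm` and all numeric values are hypothetical. The idea is that `gamma * (W x + b - mean) / sqrt(var + eps) + beta` can be rewritten as a single affine layer with rescaled weights and a shifted bias.

```python
import math


def dense(x, weights, bias):
    # Fully connected layer: one row of weights per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    # Inference-time batch normalization with fixed statistics.
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    # Fold the batch normalization parameters into the preceding layer.
    # Each output unit gets scale = gamma / sqrt(var + eps);
    # new weights = scale * W, new bias = scale * (b - mean) + beta.
    folded_w, folded_b = [], []
    for row, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([scale * w for w in row])
        folded_b.append(scale * (b - m) + bt)
    return folded_w, folded_b


# Hypothetical example values (assumptions, not data from the thesis).
W = [[0.5, -1.0], [2.0, 0.25]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [4.0, 0.25]
x = [1.0, 2.0]

reference = batchnorm(dense(x, W, b), gamma, beta, mean, var)
W_f, b_f = fold_batchnorm(W, b, gamma, beta, mean, var)
folded = dense(x, W_f, b_f)
```

Under these assumptions `folded` matches `reference` up to floating-point rounding, so the four batch normalization parameter vectors no longer need to be stored or applied at inference time.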

Supervisors: Guido Masera, Giovanni Ramponi
Academic year: 2020/21
Publication type: Electronic
Number of pages: 89
Subjects:
Degree programme: Master's degree programme in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica)
Degree class: New system > Master's degree > LM-29 - Electronic Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/16616