
Resilient Deep Neural Network for FPGA space applications

Francesco La Carpia

Resilient Deep Neural Network for FPGA space applications.

Supervisors: Luciano Lavagno, Mario Roberto Casu, Mihai Teodor Lazarescu, Filippo Minnella. Politecnico di Torino, Master's degree course in Mechatronic Engineering (Ingegneria Meccatronica), 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Radiation can induce errors of varying severity in electronic devices, some of them catastrophic. In space missions, satellites often collect and store large amounts of data on board before transmitting it to Earth. Losing this data would waste considerable resources and could cause the entire mission to fail. Resilience techniques aim to prevent such errors in a radiation-rich environment such as space. The adoption of machine learning and artificial intelligence techniques in edge computing systems is growing because of their efficiency, reduced latency and enhanced decision-making capabilities. FPGAs can serve as dataflow accelerators that compute operations in parallel, enabling the execution of numerous computations with low energy consumption. Moreover, because they are programmable, they can be reconfigured multiple times, which ensures a high level of flexibility. For these reasons, the convolution operations found in many deep learning algorithms are well suited to such devices. This study began with a thorough analysis of the hazards that the space environment poses to an FPGA platform, with a specific focus on Single Event Upsets (SEUs). In SRAM-based FPGAs, such errors can cause malfunctions whose consequences range from moderate to severe. The study then analyzed the solutions most commonly used to address these problems. They operate at the system level, at the structure level, in the design of individual logic cells and in the design of the FPGA configuration netlist; the most common methodologies are ECC (error-correcting codes), TMR (triple modular redundancy) and partial or total reconfiguration of the configuration memory. Currently, these solutions provide good resilience to radiation-induced errors.
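As a minimal illustration of the TMR idea mentioned above, the following Python sketch applies a bitwise majority vote to three redundant copies of a data word. It is not taken from the thesis: the function names `tmr_vote` and `flip_bit` and the 8-bit example word are assumptions chosen for illustration, and a real FPGA design would implement the voter in hardware rather than in software.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies of the same word."""
    # Each output bit is 1 when at least two of the three input bits are 1.
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Return `word` with the given bit position inverted (a simulated upset)."""
    return word ^ (1 << bit)


# Hypothetical 8-bit word stored in three redundant copies.
golden = 0b10110010

# A single upset in one copy is masked by the other two copies.
single_upset = tmr_vote(golden, flip_bit(golden, 5), golden)

# The same bit upset in two copies defeats the voter.
double_upset = tmr_vote(flip_bit(golden, 5), flip_bit(golden, 5), golden)

print(single_upset == golden)  # the single upset is corrected
print(double_upset == golden)  # the double upset is not
```

The sketch shows why TMR is usually paired with periodic reconfiguration (scrubbing) of the configuration memory: the voter masks one faulty copy, but errors that accumulate in a second copy are no longer corrected.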
Various techniques that exploit the intrinsic resilience of neural networks can be combined with customized hardware and software solutions to keep performance levels unchanged. Following this preliminary analysis, a ship detection application based on satellite images was developed. The algorithm chosen for this task is SSD (Single Shot MultiBox Detector), which allows real-time recognition of objects at varying scales by using custom-designed default boxes (or prior boxes) whose dimensions depend on the sizes of the objects to be identified. The network backbone was quantized with the Brevitas framework, using 8 bits for weights and activations and 16 bits for biases. Subsequently, I simulated bit-flip errors in software at the feature-map level, evaluating scenarios in which errors have catastrophic consequences for the network's inference. Errors were injected using PyTorch hook functions, which give access to intermediate modules of the network during inference. This approach makes it possible to observe how network performance changes depending on which bit is flipped (MSB or LSB). The thesis was conducted in collaboration with AIKO, an emerging company that produces software technologies for space applications.
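The difference between an MSB and an LSB flip can be made concrete with a small sketch. The Python code below flips one bit of a single signed 8-bit value, assuming a two's-complement representation for the quantized activations; this encoding, the helper names and the example value 23 are illustrative assumptions, not details taken from the thesis. The code is deliberately independent of PyTorch and Brevitas and only shows the arithmetic of one injected error, not the hook-based injection itself.

```python
def to_unsigned8(value: int) -> int:
    """Map a signed 8-bit integer to its raw two's-complement byte (0..255)."""
    return value & 0xFF


def to_signed8(raw: int) -> int:
    """Map a raw byte (0..255) back to a signed 8-bit integer (-128..127)."""
    return raw - 256 if raw >= 128 else raw


def flip_int8_bit(value: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit value and return the corrupted value."""
    if not -128 <= value <= 127:
        raise ValueError("value must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in the range 0..7")
    return to_signed8(to_unsigned8(value) ^ (1 << bit))


# Hypothetical quantized activation taken from a feature map.
activation = 23

# Absolute error introduced by flipping the least and most significant bits.
lsb_error = abs(flip_int8_bit(activation, 0) - activation)
msb_error = abs(flip_int8_bit(activation, 7) - activation)

print(lsb_error, msb_error)  # prints "1 128"
```

Under these assumptions, an LSB flip perturbs the value by one quantization step, while an MSB flip changes the sign bit and shifts the value by 128 steps, which is consistent with the expectation that MSB errors are the ones most likely to disrupt inference.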

Supervisors: Luciano Lavagno, Mario Roberto Casu, Mihai Teodor Lazarescu, Filippo Minnella
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 100
Degree course: Master's degree course in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New organization > Master science > LM-25 - AUTOMATION ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/30966