Split Point Selection in Privacy-Preserving Split Inference with Fully Homomorphic Encryption: A Performance Assessment

Michela Lucia Saraceno

Split Point Selection in Privacy-Preserving Split Inference with Fully Homomorphic Encryption: A Performance Assessment.

Supervisors: Valentino Peluso, Daniele Jahier Pagliari, Andrea Calimera. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025

Abstract:	The widespread adoption of Artificial Intelligence (AI) in domains such as healthcare, finance, and facial recognition raises significant concerns about the privacy of sensitive user data. AI services are often operated by third-party providers that need access to user data for processing, which exposes that data to potential privacy breaches. Recent data protection regulations, such as GDPR, further emphasize the need for privacy-preserving AI technologies. Fully Homomorphic Encryption (FHE) offers a promising solution by enabling arbitrary computation on encrypted data without decryption. With FHE, a service provider can process encrypted user data and return encrypted predictions, which only the user can decrypt with a private key, ensuring end-to-end data privacy. However, FHE introduces significant computational overhead that prevents the practical deployment of deep neural network models. This thesis addresses this issue through split inference, where a deep neural network is partitioned into two segments: an unencrypted portion executed on the client side, and an encrypted portion computed on the server using FHE. This collaborative approach enables effective trade-offs between computational efficiency and model protection. The work presents a framework for systematically evaluating split point selection in deep neural networks encrypted using Fully Homomorphic Encryption over the Torus (TFHE), by analyzing the impact of different split points and model bit-width configurations on both model accuracy and inference performance. Validation on two standard architectures, VGG11 and ResNet-32, demonstrates that selecting an optimal split point preserves model accuracy while significantly reducing client-side computational load, a critical factor for battery-constrained edge devices.
We define the optimal split point as the position in the neural network that satisfies two criteria: (1) an accuracy drop of less than 1% compared to the fully unencrypted model, and (2) client-side latency among the lowest across all evaluated split points. For VGG11, the optimal split occurs when the server performs private inference on the final 10 layers. This configuration achieves 88.82% accuracy, compared to 88.99% in the clear, and a latency of approximately 1.994 seconds, representing a 79.93× reduction compared to the worst-case latency. For ResNet-32, the optimal split involves the server running encrypted inference on the last 5 residual blocks, followed by average pooling and linear layers. This yields 89.10% accuracy (vs. 90.27% in the clear) and a latency of approximately 1.587 seconds, corresponding to a 6.72× reduction over the worst-case scenario. These results demonstrate that careful split point selection within a split computing framework can effectively balance privacy, model fidelity, and computational efficiency, advancing the practical deployment of privacy-preserving machine learning on resource-constrained edge devices.
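The two selection criteria stated above can be illustrated with a minimal Python sketch. It is not the thesis's framework: the names `SplitCandidate` and `select_split_point` are hypothetical, the accuracy drop is assumed to be measured in percentage points, and "among the lowest" is simplified to "lowest latency among the candidates that pass the accuracy criterion". The candidate values are illustrative placeholders, not results from the thesis.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SplitCandidate:
    """One evaluated split point (hypothetical structure, for illustration)."""
    name: str
    accuracy: float          # accuracy in percent
    client_latency_s: float  # client-side latency in seconds


def select_split_point(
    candidates: Sequence[SplitCandidate],
    clear_accuracy: float,
    max_drop: float = 1.0,
) -> Optional[SplitCandidate]:
    """Return the lowest-latency candidate whose accuracy drop is below max_drop.

    Assumption: the drop is measured in percentage points relative to the
    fully unencrypted model. Returns None if no candidate qualifies.
    """
    # Criterion (1): keep only candidates with an acceptable accuracy drop.
    eligible = [c for c in candidates if clear_accuracy - c.accuracy < max_drop]
    if not eligible:
        return None
    # Criterion (2), simplified: pick the minimum client-side latency.
    return min(eligible, key=lambda c: c.client_latency_s)


# Illustrative placeholder candidates (not measurements from the thesis).
candidates = [
    SplitCandidate("split_a", accuracy=88.95, client_latency_s=40.0),
    SplitCandidate("split_b", accuracy=88.82, client_latency_s=2.0),
    SplitCandidate("split_c", accuracy=86.10, client_latency_s=0.5),
]

best = select_split_point(candidates, clear_accuracy=88.99)
```

In this toy example `split_c` has the lowest latency but is rejected by the accuracy criterion, so `split_b` is selected. A stricter reading of "among the lowest" could instead return every eligible candidate within some tolerance of the minimum latency.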
Supervisors:	Valentino Peluso, Daniele Jahier Pagliari, Andrea Calimera
Academic year:	2024/25
Publication type:	Electronic
Number of pages:	124
Additional information:	Embargoed thesis. Full text not available
Subjects:
Degree programme:	Master's degree programme in Data Science And Engineering
Degree class:	New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies:	NOT SPECIFIED
URI:	http://webthesis.biblio.polito.it/id/eprint/36330
