
Davide Vitabile
Enabling Edge AI: Synthetic Data Generation and Supervised Fine-Tuning for Small Language Models.
Supervisors: Antonio Jose' Di Scala, Subash Subavignesh Nachimuthu. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2025
Abstract:
The rapid development of Large Language Models (LLMs) has revolutionized natural language processing, enabling sophisticated human-computer interactions. However, the widespread deployment of these models in cloud environments raises significant concerns about privacy, data security, and computational efficiency. This thesis, conducted at Tether's Data Division, focuses on the development of privacy-preserving AI solutions through the creation of efficient Small Language Models (SLMs) with 1-3 billion parameters, optimized for local deployment on edge devices. This research makes two main contributions to the development of high-performance SLMs. First, it introduces novel pipelines for generating high-quality post-training synthetic data, which is essential for improving a model's instruction-following ability while reducing hallucinations. This approach includes an instruction-tuning pipeline with generation, diversity, and validation stages, as well as a specialized zero-shot chain-of-thought pipeline to improve reasoning capabilities. These pipelines leverage sophisticated LLMs and advanced prompting techniques to create diverse, task-specific datasets while maintaining high standards of coherence and accuracy. Second, this work details a comprehensive supervised fine-tuning (SFT) methodology for transforming base models into instruction-following assistants. The implementation uses DeepSpeed for distributed training across multiple NVIDIA H100 GPUs and incorporates parameter-efficient fine-tuning techniques such as Low-Rank Adaptation (LoRA). In addition, this work explores the crucial distinction between base models and instruct models, discusses the importance of chat templates in model-human interaction, and presents a detailed analysis of hyperparameter tuning for model performance.
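The full text of the thesis is embargoed, so the exact implementation of the pipeline stages is not available; as a purely illustrative aid, a "diversity" stage of the kind the abstract describes could be sketched as n-gram-overlap deduplication of candidate synthetic instructions. All names and thresholds below (`ngrams`, `jaccard`, `dedup`, `threshold=0.7`) are assumptions, not details from the thesis.

```python
import re


def ngrams(text: str, n: int = 3) -> set:
    """Return the set of lowercase word n-grams in a text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two n-gram sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedup(candidates: list, threshold: float = 0.7) -> list:
    """Keep a candidate only if it is sufficiently different from all kept ones."""
    kept = []
    kept_ngrams = []
    for text in candidates:
        grams = ngrams(text)
        if all(jaccard(grams, k) < threshold for k in kept_ngrams):
            kept.append(text)
            kept_ngrams.append(grams)
    return kept


samples = [
    "Explain how a binary search tree stores keys.",
    "Explain how a binary search tree stores keys efficiently.",
    "Summarize the plot of a short story in two sentences.",
]
print(dedup(samples))  # the near-duplicate second sample is dropped
```

A production pipeline would typically use embedding-based similarity rather than surface n-grams, but the filtering structure (keep only candidates dissimilar from everything already kept) is the same.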
The methodologies developed in this thesis present effective approaches for creating and improving small-scale language models that run locally while maintaining high performance. Through these advancements, this work contributes to the growing field of Edge AI by providing privacy-conscious alternatives to cloud-based solutions, addressing the increasing demand for efficient, secure, and locally deployable AI systems.
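The LoRA technique the abstract mentions can be illustrated independently of the embargoed implementation: instead of updating a full weight matrix W (d_out x d_in), LoRA trains two small matrices A (r x d_in) and B (d_out x r) and computes h = W x + (alpha / r) * B (A x), so only r * (d_in + d_out) parameters are trainable (e.g. for d_in = d_out = 1024 and r = 8, about 16K parameters instead of roughly 1M). The plain-Python sketch below is a minimal numerical illustration of that formula; the function and variable names are illustrative, not from the thesis, which applies LoRA inside a DeepSpeed-based SFT pipeline.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha=16, r=8):
    """Frozen base projection plus the scaled rank-r update: W x + (alpha/r) * B (A x)."""
    base = matvec(W, x)                  # frozen pretrained weights
    low_rank = matvec(B, matvec(A, x))   # trainable low-rank path B @ (A @ x)
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, low_rank)]


# Tiny worked example: d_in = d_out = 2, rank r = 1, alpha = 2 (scale = 2.0).
W = [[1.0, 0.0], [0.0, 1.0]]   # identity base projection
A = [[1.0, 1.0]]               # 1 x 2
B = [[1.0], [1.0]]             # 2 x 1
x = [1.0, 2.0]
print(lora_forward(W, A, B, x, alpha=2, r=1))  # → [7.0, 8.0]
```

With A and B initialized so that B A = 0 (as is standard for LoRA), the adapted model starts out exactly equal to the base model, and training only ever moves the small rank-r term.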
Supervisors: Antonio Jose' Di Scala, Subash Subavignesh Nachimuthu
Academic year: 2024/25
Publication type: Electronic
Number of pages: 86
Additional information: Embargoed thesis. Full text not available.
Subjects:
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: Nuovo ordinamento > Laurea magistrale > LM-32 - INGEGNERIA INFORMATICA
Collaborating companies: Tether Operations Limited (BVI)
URI: http://webthesis.biblio.polito.it/id/eprint/35531