Natural Language Processing based Automatic Ingestion of Automotive Data

Marco Cirone

Natural Language Processing based Automatic Ingestion of Automotive Data.

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2025

PDF (Tesi_di_laurea) - Thesis
Access restricted to: staff users only until 11 April 2028 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.

Download (5MB)
Abstract:

Natural Language Processing (NLP) offers a range of tasks for building automated processes that handle textual data. One of them is Named Entity Recognition (NER), which aims to identify and classify specific entities within a text. Pretrained NER models can perform very well on generic data, but they may be less effective on specialized data, such as company data, which shows its own distinctive features and patterns. One possible way to overcome this obstacle is to create a specialized model trained specifically on the company dataset. The goal of this thesis is to apply different models and methodologies in order to develop a Machine Learning model able to identify key data points of cars (such as make, model and transmission type) from the version names of the vehicles. The work is divided into three main phases: analysis of the dataset and of the logs produced by the previous model already implemented by the company; selection and training of the models; and evaluation of their performance on unseen data.
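
To make the task concrete, the sketch below shows the kind of input and output that entity extraction from a version name involves. It is not the method used in the thesis: a simple dictionary (gazetteer) lookup stands in for a trained NER model. The labels, the dictionary entries, the function name `tag_entities` and the sample version name are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical gazetteers, for illustration only. A trained NER model
# would learn such patterns from labelled company data instead.
GAZETTEER = {
    "MAKE": ["fiat", "alfa romeo"],
    "MODEL": ["panda", "giulia"],
    "TRANSMISSION": ["automatic", "manual"],
}


def tag_entities(version_name):
    """Return (label, matched text, start, end) tuples found in version_name."""
    found = []
    lowered = version_name.lower()
    for label, terms in GAZETTEER.items():
        for term in terms:
            # Whole-word, case-insensitive match of each dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, lowered):
                start, end = match.span()
                found.append((label, version_name[start:end], start, end))
    # Order the entities by their position in the text.
    return sorted(found, key=lambda item: item[2])


if __name__ == "__main__":
    # Invented version name, not taken from the company dataset.
    for entity in tag_entities("Alfa Romeo Giulia 2.0 Turbo Automatic"):
        print(entity)
```

A lookup of this kind only recognizes terms that are already in its dictionaries, which is one reason a model trained on the company's own data may generalize better to unseen version names.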

Supervisors: Paolo Garza
Academic year: 2024/25
Publication type: Electronic
Number of pages: 66
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - Computer Engineering (INGEGNERIA INFORMATICA)
Collaborating companies: Jato Dynamics Italia
URI: http://webthesis.biblio.polito.it/id/eprint/35348