Gennaro Petito
Pipeline for the automatic population of an automotive database: from retrieval to parsing of textual descriptions.
Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2022
Abstract
Natural language processing has proven very effective at automating many processes, especially since the introduction of the Transformer and of large language models such as BERT, which produce contextual embeddings and can be fine-tuned to perform new tasks on specific datasets. JATO, like other companies, relies on human intervention to retrieve and extract data from text in order to populate certain databases, particularly when the extraction requires natural language understanding rather than a simple rule-based system. This work is repetitive and time-consuming, so in this thesis we implement a pipeline that automates the population of a database of optional car equipment.
We use two sources of information provided by car manufacturers: configurators and brochures.
