Zifan Chen
Enhanced Correction and Multi Language Support of Transcription on the "Tell your story" Digital Platform.
Supervisors: Luciano Lavagno, Gianpiero Cabodi. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2023
Thesis PDF (Tesi_di_laurea), 10MB. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:
This thesis is a practical project that applies a dynamic programming algorithm to a specific feature of the web platform [Tell your story](https://tiraccontounastoria.org/st/). "Tell your story" is a digital platform that serves a societal function by letting people preserve, share, and listen to individuals' lifetime memories. It is designed to restore the significance of personal experiences, particularly for the elderly, by alleviating their isolation and aiding memory recall. It also allows relatives and friends to explore and uncover specific details within these oral stories. The platform incorporates an interview system that automatically records stories as voice messages. Users can record brief or detailed voice messages, either in response to a series of questions or in free form. When someone listens to a story, the audio is played on the website and, at the same time, a transcription is displayed to aid comprehension and sharing. Each word in the transcription is timestamped, linking it to the corresponding segment of the audio.

In this thesis, I developed a set of methods in Django to support two functionalities. Although the primary intention of "Tell your story" is to preserve the authenticity and emotional resonance of spoken language rather than to rectify errors or peculiarities, users can still edit and correct the automated transcription of a voice message if they wish. The first objective of this thesis is to let the owner of a story correct the machine-generated transcription, while automatically recalculating or preserving the timestamp that links each word to its audio segment. The second objective is to let the storyteller or the platform's administrator generate a translated version of the transcription, while keeping each translated term aligned as closely as possible to its original position in the audio.

The core algorithm employed in this thesis is the Levenshtein distance. This dynamic programming algorithm is frequently used to measure the similarity between two strings, or between broader sequences such as time series and DNA, and to find the optimal alignment between two sequences. It is widely applied in fields such as computational biology, machine translation, speech recognition, and named entity extraction.
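To make the first objective more concrete, the following Python sketch shows one way a word-level Levenshtein alignment could carry timestamps from a machine transcription over to a user-corrected one. It is an illustration under stated assumptions, not the implementation from the thesis: the function names `align` and `retime`, the `(text, start time)` word format, and the rule that an inserted word inherits the timestamp of the nearest preceding original word are all hypothetical choices made for this example.

```python
from typing import List, Optional, Tuple

Word = Tuple[str, float]  # hypothetical format: (text, start time in seconds)


def align(old: List[str], new: List[str]) -> List[Tuple[Optional[int], Optional[int]]]:
    """Return index pairs (i, j) of one optimal Levenshtein alignment.

    i is None for a word inserted in `new`; j is None for a word deleted from `old`.
    """
    n, m = len(old), len(new)
    # dist[i][j] = edit distance between old[:i] and new[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if old[i - 1] == new[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
            )
    # Walk back from the bottom-right corner to recover the alignment.
    pairs: List[Tuple[Optional[int], Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if old[i - 1] == new[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                pairs.append((i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


def retime(original: List[Word], corrected: List[str]) -> List[Word]:
    """Attach timestamps to corrected words using the alignment.

    Assumption of this sketch: matched and substituted words keep the original
    timestamp; inserted words reuse the timestamp of the last original word seen
    (0.0 if there is none yet); deleted words are dropped.
    """
    old_words = [text for text, _ in original]
    result: List[Word] = []
    last_time = 0.0
    for i, j in align(old_words, corrected):
        if i is not None:
            last_time = original[i][1]
        if j is not None:
            result.append((corrected[j], last_time))
    return result


original = [("I", 0.0), ("was", 0.4), ("borne", 0.7), ("in", 1.1), ("Turin", 1.3)]
corrected = ["I", "was", "born", "in", "beautiful", "Turin"]
print(retime(original, corrected))
```

In this example the corrected word "born" keeps the 0.7 s timestamp of "borne", and the inserted word "beautiful" reuses the 1.1 s timestamp of "in". Other policies for inserted words, such as interpolating between neighbouring timestamps, would fit the same alignment step.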
| Field | Value |
|---|---|
| Supervisors | Luciano Lavagno, Gianpiero Cabodi |
| Academic year | 2023/24 |
| Publication type | Electronic |
| Number of pages | 60 |
| Subjects | |
| Degree programme | Master's degree programme in Data Science And Engineering |
| Degree class | New system > Master's degree > LM-32 - Computer Engineering |
| Collaborating companies | Not specified |
| URI | http://webthesis.biblio.polito.it/id/eprint/29452 |