Computational Intelligence for Modeling Drift and Noise in Mass Spectra for Detecting SARS CoV2 from Patients' Breaths.

Muhammad Aamir Ali

Supervisors: Giovanni Squillero, Nicolo' Bellarmino, Riccardo Cantoro. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2023

Abstract:

The COVID-19 pandemic is a global crisis that requires effective control measures to prevent further spread of the virus. The virus spreads primarily through droplets, so identifying clusters of positive cases is a crucial step in limiting transmission. Mass testing can identify positive cases quickly, but many available tests are invasive and may require medical assistance, which can be stressful or dangerous. There is therefore a need for non-invasive testing methods that support prompt and accurate diagnosis of COVID-19 while reducing the risk of transmission and minimizing patient discomfort. NanoTech Analysis S.r.l. (NTA) has proposed a non-invasive approach to COVID-19 testing: a sample of the patient's breath is used to produce a mass spectrum, which is then analyzed with machine learning algorithms. In this study, we use raw data provided by NTA, comprising multiple patient samples and their corresponding mass spectra, to build a tabular dataset. We then examine this dataset and present the results, with the aim of identifying potential challenges in the data and evaluating solutions to overcome them, before analyzing the dataset with machine learning techniques. Typically, a gaseous sample is analyzed using Gas Chromatography Mass Spectrometry (GC-MS), a process that takes a significant amount of time because each compound in the sample is evaluated individually, which makes it impractical for mass screening. The NTA instrument, by contrast, can analyze the gaseous sample and provide results within 15 minutes.
The output from the instrument gives a general overview of the quantity of compounds present in the sample through a series of mass spectra. Acquiring data from dirty mass spectra is challenging because noise and artifacts can interfere with the signals of interest and reduce the accuracy and reliability of the results. To overcome these challenges and generate reliable data, data cleaning methods such as smoothing, baseline correction, and peak detection can be applied to remove noise and artifacts, which helps to identify and quantify the compounds of interest more accurately and with greater confidence. In addition to data cleaning, statistical methods were used to find relationships between the different ions present in the mass spectra. Techniques such as principal component analysis, clustering, and regression analysis were applied to identify patterns and correlations within the data. Understanding the relationships between the ions yielded insights into the composition of the sample and helped identify potential biomarkers and other compounds of interest. Overall, the work on data acquisition and analysis of mass spectra played a critical role in generating reliable data and advancing our understanding of the compounds present in complex mixtures. By combining data cleaning and statistical analysis techniques, meaningful information was extracted from dirty mass spectra, leading to insights into the underlying chemistry of the samples. These results carry important implications.
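
The cleaning steps named in the abstract (smoothing, baseline correction, peak detection) can be illustrated with a minimal sketch. The thesis full text is not available, so this is not the authors' pipeline: the moving-average smoother, the rolling-minimum baseline estimate, the window sizes, the threshold, and the synthetic spectrum are all assumptions chosen only to make the idea concrete.

```python
# Illustrative only: a toy cleaning pipeline on a synthetic "spectrum".
# All function names, window sizes, and the threshold are hypothetical.

def moving_average(signal, window=5):
    """Smooth with a centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def subtract_baseline(signal, window=21):
    """Estimate the baseline as a rolling minimum and subtract it."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(signal[i] - min(signal[lo:hi]))
    return out

def find_peaks(signal, threshold):
    """Return indices of strict local maxima that exceed the threshold."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] > signal[i + 1]
    ]

def triangle(i, center, height, half_width=4):
    """Triangular peak shape used to build the synthetic spectrum."""
    return max(0.0, height * (1 - abs(i - center) / half_width))

# Synthetic spectrum: linear drift + alternating "noise" + two peaks.
raw = [
    0.02 * i                                # slow baseline drift
    + (0.05 if i % 2 == 0 else -0.05)       # deterministic stand-in for noise
    + triangle(i, 30, 5.0)                  # strong peak at index 30
    + triangle(i, 70, 3.0)                  # weaker peak at index 70
    for i in range(100)
]

smoothed = moving_average(raw, window=5)
corrected = subtract_baseline(smoothed, window=21)
peaks = find_peaks(corrected, threshold=1.0)
print(peaks)  # prints [30, 70]
```

On real spectra, library routines such as a Savitzky-Golay filter or a dedicated peak finder would usually replace these hand-written helpers; the pure-Python versions are used here only to keep the sketch dependency-free.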

Supervisors: Giovanni Squillero, Nicolo' Bellarmino, Riccardo Cantoro
Academic year: 2022/23
Publication type: Electronic
Number of pages: 72
Additional information: Confidential thesis. Full text not available
Subjects:
Degree course: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New regulations > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: NanoTech Analysis srl
URI: http://webthesis.biblio.polito.it/id/eprint/26841