
Health Records Analysis to build Final Patient dataset for dementia disease

Gaetano Ferrara

Health Records Analysis to build Final Patient dataset for dementia disease.

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree course in Ingegneria Informatica (Computer Engineering), 2020

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Big Data and data mining technologies are becoming essential in everyday life. Banks use these tools to monitor the financial market through network activity monitors, in order to minimize fraudulent transactions. The automotive industry uses them to cut back on options that underperform in the market. The healthcare sector, however, has not yet benefited from the wide-scale use of Big Data technologies. The reason is that healthcare has traditionally been slow to adopt new ICT techniques, because most clinical data was stored on paper. The spread of electronic health records (EHRs) in healthcare facilities was an important step towards the digitalization of health. With current hospital IT systems, however, the data cannot be used to make clinical decisions, because much of it was collected incorrectly or, rather, for operational purposes. Dementia is one of the most widespread diseases in the world among older people. Someone in the world develops dementia every 3 seconds. Around 50 million people have dementia, and there are nearly 10 million new cases every year. The most common form of dementia is Alzheimer's disease, which is associated with 60-70% of cases. This master's thesis focuses on dementia, using anonymized data that collect various information during the progression of the disease. We present a data-driven approach to build the main dataset, called Final Patient Data, and use it to define a set of three key performance indicators that are a preliminary step toward a more complex machine learning analysis: 1) Multiple Filter Analysis, 2) identification of strong correlations, 3) rate of progression after the first diagnosis. To achieve this goal, the thesis first presents a method to extract patient processes from unstructured data sets.
To structure the project and make it replicable, CRISP-DM has been adopted as the methodology. The thesis also presents methods to analyze, clean and prepare data to obtain structured datasets from which the mentioned KPIs can be measured.

Supervisors: Paolo Garza
Academic year: 2019/20
Publication type: Electronic
Number of Pages: 102
Degree course: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating organizations: Universidad Politecnica de Madrid
URI: http://webthesis.biblio.polito.it/id/eprint/15274