Michele Pantaleo
GETALP-MISTRAL7B: A Clinical Large Language Model For Automated Discharge Documentation From Electronic Health Records.
Supervisors: Gabriella Olmo, Didier Schwab, Lorraine Goeuriot. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2025
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This work presents GETALP-Mistral7B, a clinical large language model (LLM) designed to automatically generate discharge documentation. Leveraging patients’ Electronic Health Records (EHRs), the model generates two central sections of a discharge summary: the Hospital Course (HC) and the Discharge Instructions (DI). EHRs are usually stored in forms or tables that differ across hospitals. To ensure interoperability across heterogeneous systems, EHRs were transformed into two task-specific textual formats: the Diary for generating the Hospital Course, and the Patient Summary for producing the Discharge Instructions. GETALP-Mistral7B is fine-tuned from Asclepius-Mistral-7B using 104,528 encounters from the Beth Israel Deaconess Medical Center (MIMIC-IV). Quantized Low-Rank Adaptation (QLoRA) is used to fine-tune the model separately for each section, yielding two specialized lightweight adapters while keeping the base model weights frozen.
GETALP-Mistral7B is benchmarked against models from the first shared task on clinical text generation: Discharge-Me!
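To illustrate the idea of turning tabular EHR data into a textual format, the following is a minimal sketch of how timestamped records from different tables might be serialized into a chronological, diary-like text. It is not the thesis's actual Diary format: the function name `build_diary`, the keys `time`, `source`, and `text`, the `Day ...` layout, and the sample records are all assumptions made for illustration.

```python
from datetime import datetime


def build_diary(events):
    """Serialize timestamped EHR events into a chronological plain-text diary.

    `events` is a list of dicts with hypothetical keys:
      "time"   - ISO 8601 timestamp string
      "source" - name of the table the record came from
      "text"   - short free-text rendering of the record
    """
    # Sort by timestamp so records from different tables interleave correctly.
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["time"]))

    lines = []
    current_day = None
    for event in ordered:
        stamp = datetime.fromisoformat(event["time"])
        day = stamp.date().isoformat()
        # Emit a day header whenever the calendar day changes.
        if day != current_day:
            lines.append(f"Day {day}")
            current_day = day
        lines.append(f"{stamp:%H:%M} [{event['source']}] {event['text']}")
    return "\n".join(lines)


# Made-up records for illustration only (not MIMIC-IV data).
events = [
    {"time": "2025-01-02T08:30:00", "source": "labs",
     "text": "Creatinine 1.4 mg/dL"},
    {"time": "2025-01-01T22:15:00", "source": "admission",
     "text": "Admitted with chest pain"},
    {"time": "2025-01-02T14:00:00", "source": "medications",
     "text": "Started aspirin 81 mg daily"},
]

diary = build_diary(events)
print(diary)
```

Flattening every table into the same line-oriented text means the downstream model sees one input shape regardless of how a given hospital stores its records, which is the interoperability motivation stated in the abstract.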
