Davide Elio Stefano Demicheli
Real-Time Object Detection and Gaze Tracking for Automated Pilot Monitoring.
Supervisor: Tatiana Tommasi. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025
Abstract:

Pilot performance is a critical factor in aviation safety, as modern cockpits require operators to process large amounts of information while managing multiple parallel tasks. Under high workload, distraction, or stress, this reliance on human vigilance can lead to lapses that compromise operational reliability. This motivates the development of intelligent assistance systems capable of monitoring pilot actions in real time and providing continuous, objective support. In this thesis, we present and evaluate an integrated pipeline that combines object detection with gaze tracking to assess pilot interaction with cockpit instruments. While the system is applied to a checklist verification task in a general aviation cockpit simulator, the approach is designed to be generalizable to more complex environments such as airliner cockpits, where automated monitoring could provide substantial benefits, particularly in single-pilot operations. The experimental environment is based on X-Plane 11, a flight simulator providing a realistic and controllable cockpit setting. A tailored dataset was constructed by recording cockpit videos, segmenting relevant instruments offline with Segment Anything Model 2 (SAM2), and converting the resulting masks into You Only Look Once (YOLO)-compatible annotations. A Common Objects in Context (COCO)-pretrained lightweight YOLO detector (YOLO11n) was fine-tuned on this dataset under different experimental conditions to determine the best trade-off between detection accuracy, training duration, and inference speed, with the goal of optimizing real-time performance. The trained models were then integrated with gaze data obtained from the Tobii Pro Glasses 3 eye tracker and simulation data retrieved using the open-source NASA X-Plane Connect plugin.

Synchronizing gaze with detections enabled identifying, instant by instant, which cockpit instruments the pilot was observing and, by cross-checking with simulation data, tracking checklist progression. To demonstrate the usability of the pipeline, a graphical interface was developed to visualize detections, gaze overlays, and checklist state in real time. Results show that the trained lightweight YOLO models run at real-time speed, with inference times of ≈5 ms while detecting up to 32 objects per frame on an RTX A6000 GPU. On the test set, the best-performing model achieved a recall of ≈97% and a mean Average Precision at IoU=0.5 (mAP@50) of ≈96%. The experiments also underline the practical challenges of combining gaze data with object detection, particularly with respect to gaze precision, bounding-box stability, and small-object recognition. This work contributes a proof of concept for gaze-driven pilot monitoring and lays the foundation for multimodal extensions. Future research directions include automatic gauge and display reading through Optical Character Recognition (OCR), speech integration, and broader applications in domains where procedural compliance and human attention play a critical role, such as medicine, manufacturing, and transportation.

| Supervisor: | Tatiana Tommasi |
|---|---|
| Academic year: | 2025/26 |
| Publication type: | Electronic |
| Number of pages: | 69 |
| Additional information: | Confidential thesis. Full text not available |
| Subjects: | |
| Degree programme: | Master's degree programme in Data Science And Engineering |
| Degree class: | New regulations > Master's degree > LM-32 - Computer Engineering |
| Partner institutions: | Politecnico di Torino |
| URI: | http://webthesis.biblio.polito.it/id/eprint/37843 |
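The abstract mentions converting SAM2 segmentation masks into YOLO-compatible annotations. The thesis's actual conversion code is not available here, but a minimal sketch of that step, assuming binary instance masks and the standard YOLO detection label format (normalized `class x_center y_center width height` per line), could look like:

```python
import numpy as np

def mask_to_yolo_line(mask: np.ndarray, class_id: int) -> str:
    """Convert a binary instance mask (H, W) into one line of a YOLO
    detection label file, with coordinates normalized to [0, 1]."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("empty mask: no instance pixels")
    h, w = mask.shape
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    # Tight bounding box around the mask, center and size normalized
    # by image dimensions (the +1 makes the box cover whole pixels).
    x_c = (x_min + x_max + 1) / 2 / w
    y_c = (y_min + y_max + 1) / 2 / h
    bw = (x_max - x_min + 1) / w
    bh = (y_max - y_min + 1) / h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {bw:.6f} {bh:.6f}"
```

For example, a mask filling rows 10–19 and columns 30–49 of a 100×200 image yields a box centered at (0.2, 0.15) with normalized size 0.1 × 0.1.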
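The pipeline also synchronizes gaze points with detected bounding boxes to identify which instrument the pilot is observing. The thesis's synchronization logic is not shown here; the following is only a hedged sketch of the core lookup, assuming detections as labeled pixel-space boxes and a hypothetical tie-break that prefers the smallest (most specific) containing box:

```python
def instrument_under_gaze(gaze_xy, detections):
    """Return the label of the detection whose bounding box contains the
    gaze point, preferring the smallest containing box; None if no box
    contains it. detections: iterable of (label, x1, y1, x2, y2) in pixels."""
    gx, gy = gaze_xy
    hits = [
        ((x2 - x1) * (y2 - y1), label)
        for label, x1, y1, x2, y2 in detections
        if x1 <= gx <= x2 and y1 <= gy <= y2
    ]
    return min(hits)[1] if hits else None
```

With nested boxes (e.g. an instrument panel containing an altimeter gauge), a gaze point inside both resolves to the smaller gauge box, which matches the intuition that the pilot is reading the specific instrument rather than the whole panel.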



Creative Commons License - Attribution 3.0 Italy