Michele Merico
PDF Forensics and Attack Analysis: Development of a Unified Investigation Tool.
Supervisors: Andrea Atzeni, Paolo Dal Checco. Politecnico di Torino, Master's degree programme in Cybersecurity, 2025
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives (1MB)
Archive (ZIP) (Documenti_allegati)
- Other
License: Creative Commons Attribution Non-commercial No Derivatives (2MB)
| Abstract: |
Cyberattacks occur daily in very large numbers, and digital forensics often plays a crucial role both in mitigating such incidents and in shedding light on them. Digital forensics is the process of identifying, collecting, preserving, analyzing, and presenting digital evidence to support investigations and legal proceedings. A significant number of the attacks that digital forensics must deal with exploit the Portable Document Format (PDF), so understanding how to prevent and analyze the misuse of PDF files has become increasingly important. Several tools for PDF analysis exist, but they have important limitations for forensic use. In many real-world scenarios, PDFs are not standalone files but are transmitted as attachments to ordinary email or PEC (Posta Elettronica Certificata, Italian certified email) messages, making it essential to analyze not only their internal structure but also their associated transmission metadata in a forensically sound manner. Most existing tools focus either on static structure inspection or on malware detection; they lack integration with email metadata, do not ensure forensic soundness, do not support the analysis of files embedded within the original PDFs, and cannot automatically process PDFs attached to emails or PECs. Moreover, few of them can, in a forensically reliable way, verify embedded digital signatures or identify objects that were modified after signing or are only partially covered by a digital signature. These shortcomings make it difficult for investigators to safely extract, analyze, and correlate PDF evidence while maintaining data integrity and traceability. This thesis addresses these issues by developing an integrated tool for PDF forensic analysis called foredf. The work first studies the structure of PDFs, emails, and PECs, then evaluates existing tools, identifying peepdf as the most suitable starting point. 
Since no existing solution could automatically parse and analyze PDFs from emails or PECs while preserving metadata, a new parsing module was implemented. Furthermore, peepdf was extended to support verification of embedded objects and digital signatures. The entire toolset runs in a containerized environment to ensure forensic soundness and prevent any alteration of evidence, and it automatically generates human-readable preliminary reports that serve as a foundation for subsequent full forensic reports. The tool was evaluated on a variety of PDFs, including files retrieved from emails and PECs and files containing fake or partial digital signatures and embedded content, using static analysis and, in part, dynamic analysis. The results demonstrate that foredf allows users with moderate technical skills to verify PDF integrity, while providing forensic experts with detailed object-level information and metadata correlation. Containerization ensures secure handling of potentially malicious PDFs, reducing risks for analysts and systems. Additionally, foredf may support investigators in tracing the origin of attacks or harmful attachments without interacting directly with the files, further improving safety and efficiency in forensic investigations. With further refinements, foredf could become a valuable tool for forensic professionals, supporting the investigation of PDFs both as standalone files and as attachments to emails or PECs. It may also help non-expert users make informed decisions about whether to open received PDF documents, making it potentially useful not only for forensic analysis but also for proactive cybersecurity. |
|---|---|
| Supervisors: | Andrea Atzeni, Paolo Dal Checco |
| Academic year: | 2025/26 |
| Publication type: | Electronic |
| Number of pages: | 117 |
| Subjects: | |
| Degree programme: | Master's degree programme in Cybersecurity |
| Degree class: | New system > Master's degree > LM-32 - Computer Engineering |
| Collaborating companies: | Politecnico di Torino |
| URI: | http://webthesis.biblio.polito.it/id/eprint/38692 |
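
The abstract describes a parsing module that extracts PDFs from emails or PECs while preserving their transmission metadata. The following is a minimal Python sketch of that general idea, not foredf's actual code: the function name `extract_pdf_attachments`, the choice of headers, and the returned dictionary layout are all illustrative assumptions. It only reads the message bytes, and it records a SHA-256 digest of each attachment so that later integrity checks remain possible.

```python
import hashlib
from email import message_from_bytes, policy

# Headers kept as transmission metadata (an illustrative selection).
TRANSMISSION_HEADERS = ("From", "To", "Date", "Message-ID", "Received")


def extract_pdf_attachments(raw_message: bytes) -> list[dict]:
    """Return one record per PDF found in a raw RFC 5322 message.

    Hypothetical helper: the input bytes are never modified, and each
    record pairs the attachment with the headers of the carrying message.
    """
    msg = message_from_bytes(raw_message, policy=policy.default)

    # Collect every occurrence of each header (Received usually repeats).
    headers = {
        name: [str(value) for value in msg.get_all(name, [])]
        for name in TRANSMISSION_HEADERS
    }

    records = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        payload = part.get_payload(decode=True) or b""
        # Accept the declared MIME type or the PDF magic bytes, since a
        # sender may mislabel the attachment.
        declared_pdf = part.get_content_type() == "application/pdf"
        looks_like_pdf = payload.startswith(b"%PDF-")
        if not (declared_pdf or looks_like_pdf):
            continue
        records.append(
            {
                "filename": part.get_filename(),
                "declared_type": part.get_content_type(),
                "size": len(payload),
                "sha256": hashlib.sha256(payload).hexdigest(),
                "headers": headers,
                "content": payload,
            }
        )
    return records
```

Called on the bytes of a saved `.eml` file, the function returns the attachment bytes together with their digest and the message headers, so a report can correlate each PDF with the message that carried it without opening the PDF itself.
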



Creative Commons License - Attribution 3.0 Italy