LLM-based Generation and Evaluation of UML Class Diagrams

Roberta Soldati

Supervisors: Riccardo Coppola, Giacomo Garaccione. Politecnico di Torino, Master's degree programme in Ingegneria Gestionale (Engineering And Management), 2025

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:	In the digital era, information systems play an essential role in supporting the operational processes of modern organizations. Among the phases of software development, conceptual modeling remains one of the most decisive, especially in the early stages, when system requirements are still being explored and formalized. Conceptual modeling relies on diagrammatic notation, and Unified Modeling Language (UML) class diagrams have emerged as one of the most widely adopted standards for representing the static structure of object-oriented systems. These diagrams let designers represent classes, their attributes, and relationships such as associations, generalizations, and aggregations in abstract form, providing a visual formalism that facilitates analysis, communication, and software specification. However, despite their expressive power and standardization, UML class diagrams pose considerable cognitive challenges for learners and are hard for evaluators to assess consistently. In educational contexts, instructors frequently have to assess student diagrams manually, and equally valid structural variations can make grading subjective. From an industrial standpoint, early-stage software modeling is often constrained by time and the cost of human effort, especially when modeling from informal requirement specifications. This thesis asks to what extent artificial intelligence can assist or even automate the generation and evaluation of UML class diagrams from textual requirements. In recent years, Large Language Models (LLMs) based on transformer architectures have made it possible to generate syntactically correct and semantically plausible outputs in a variety of formats, from natural language to domain-specific representations such as JSON.
However, the capabilities of these models are strongly influenced by the formulation of the input, commonly known as the prompt, which defines the task context, constraints, and expected structure. This has led to the emergence of prompt engineering as a critical practice for aligning LLM outputs with specific domain goals, especially when precision and interpretability are critical. Hence, this thesis investigates the feasibility and effectiveness of employing LLMs for the automatic generation and evaluation of UML class diagrams. Specifically, it explores whether and how prompt-engineered LLMs can generate class diagrams that comply with syntactic, semantic, and pragmatic standards of quality, and whether these outputs can be assessed through replicable and objective evaluation mechanisms. The research has a twofold objective: designing and optimizing a prompting strategy that enables LLMs to generate UML diagrams in JSON format, fully compatible with the Apollon and UML-Modeler tools, and developing an evaluation pipeline that assesses the correctness of such diagrams and their similarity to reference diagrams through rubric-based scoring. The findings confirm the potential of LLMs in educational modeling tasks, although limitations remain. Diagram generation is sensitive to prompt formulation, and rubric-based evaluations, while practical, introduce subjectivity. Future work could extend the approach to other modeling languages, incorporate visual assessment, and average multiple human evaluations to reduce bias and enhance reliability.
Supervisors:	Riccardo Coppola, Giacomo Garaccione
Academic year:	2024/25
Publication type:	Electronic
Number of pages:	79
Subjects:
Degree programme:	Master's degree programme in Ingegneria Gestionale (Engineering And Management)
Degree class:	New system > Master's degree > LM-31 - INGEGNERIA GESTIONALE
Collaborating companies:	Not specified
URI:	http://webthesis.biblio.polito.it/id/eprint/35962
