Adaptive GUI Test Evolution and Oracle Maintenance

Alessandro Poletti

Adaptive GUI Test Evolution and Oracle Maintenance.

Supervisors: Riccardo Coppola, Tommaso Fulcini. Politecnico di Torino, Master's degree programme in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2025

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:	The mobile application industry has grown significantly in the last few years, pushing companies to launch their own applications as quickly as possible and to support them with continuous updates. In this context, a primary concern is the need for a thorough testing phase, which is often performed superficially or neglected altogether because it is costly and time-consuming. GUI testing in particular plays a crucial role, since it simulates direct user interaction. As an application evolves, the visual appearance, internal structure, and properties of its GUI often change. As a result, tests written for one version of the application may fail when executed on a subsequent version, not because the application malfunctions, but because the tests themselves are outdated. This issue is often described as fragility in the software testing literature. Consequently, when a test fails after an update, developers must spend additional effort to understand whether it failed because of an actual defect in the application or because the test is outdated. In the latter case, the test must be manually repaired for the updated application. Over the years, many solutions have been proposed to reduce the effort needed for test repair. Large Language Models (LLMs), with their ability to understand and generate natural language and other types of content across a wide range of tasks, are a promising solution to the challenges of mobile GUI test repair. The goals of this study are: first, to determine the most common reasons why mobile GUI tests break from one application version to the next; second, to understand how an LLM can help reduce the effort needed for mobile GUI test repair; third, to compare the performance of an LLM-based repair approach with that of a state-of-the-art repair tool, namely Healenium-Appium.
To do so, we gathered 19 real-world Android applications with 61 broken GUI tests and analyzed the causes of the breakages. We then developed an LLM-based approach for mobile GUI test repair that leverages multiple interactions with an LLM to generate repaired tests. Finally, we compared the performance of our approach with the Healenium-Appium baseline. In the experimental evaluation, our LLM-based approach repaired 45 of the 61 tests (73.8%) after only one interaction with the LLM, and 56 of the 61 tests (91.8%) after two or more interactions, outperforming the Healenium-Appium baseline by generating 50.8% and 68.8% more repaired tests, respectively. These results indicate that LLM-based repair is a very promising approach to addressing the fragility of mobile GUI tests. Our approach significantly reduces the manual effort required to maintain test suites as applications evolve, while achieving higher repair success rates than existing automated solutions. Integrating LLMs into mobile testing workflows can help developers improve test reliability and the scalability of test maintenance.
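As a quick check of the arithmetic behind the reported repair rates, the percentages can be recomputed from the raw counts given in the abstract. This is only an illustrative sketch: the helper name `repair_rate` and the constant `TOTAL_BROKEN_TESTS` are hypothetical and do not come from the thesis.

```python
def repair_rate(repaired: int, total: int) -> float:
    """Return the share of repaired tests as a percentage, rounded to one decimal."""
    return round(100 * repaired / total, 1)


# Counts taken from the abstract; the names are illustrative assumptions.
TOTAL_BROKEN_TESTS = 61

print(repair_rate(45, TOTAL_BROKEN_TESTS))  # after one LLM interaction: 73.8
print(repair_rate(56, TOTAL_BROKEN_TESTS))  # after two or more interactions: 91.8
```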
Supervisors:	Riccardo Coppola, Tommaso Fulcini
Academic year:	2025/26
Publication type:	Electronic
Number of pages:	54
Subjects:
Degree programme:	Master's degree in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica)
Degree class:	New regulations > Master's degree > LM-32 - Computer Engineering (INGEGNERIA INFORMATICA)
Collaborating companies:	Not specified
URI:	http://webthesis.biblio.polito.it/id/eprint/38662
