Politecnico di Torino

Prediction of faults in software programs using machine learning techniques

Ovidiu Birgu

Prediction of faults in software programs using machine learning techniques.

Supervisor: Maurizio Morisio. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2020

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Software development is a complex process that can generate various kinds of problems that are hard to identify during development. This master's thesis analyzes data generated by the software production process; the available data covers commits, releases, and defects. The goals of the thesis are to identify the problems solved in the past history of a project and to create models for the issue type, the severity of the problem, cross-project information, and bug location. To achieve these goals, the thesis project was split into three stages. The first stage involved the development of a software module (named PyGitHub) that collects raw data from GitHub using the GitHub API and saves it to a local relational database; the module was then used to scrape (download) various open source projects. The second stage consisted of creating a software module (named BugFinder) that reads the created databases and iterates through all the commits in order to build the evolution history of the project from its files and to extract the bugs, the solutions to bugs, their severity, and other meaningful information. This information was inserted into Neo4j (a graph database), and several CSV files containing datasets were then generated using Cypher, the query language available in Neo4j. The third and final stage consisted of using a machine learning technique (Random Forest) to create the models for issue type, severity, and cross-project issues. Finally, the datasets were given as input to the models, and the produced results were analyzed.
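As a rough illustration of the first two stages described in the abstract, the following sketch stores commit records in a local relational database and then counts bug-fix commits per file. It is not the thesis's PyGitHub or BugFinder code: the sample records, the table layout, the function names, and the keyword heuristic for spotting fix commits are all assumptions made for this example, and no GitHub API call is performed.

```python
import sqlite3

# Hypothetical commit records; the field names are assumptions that loosely
# mimic a simplified subset of what a GitHub API client could return.
SAMPLE_COMMITS = [
    {"sha": "a1", "message": "Add login page", "files": ["ui/login.py"]},
    {"sha": "b2", "message": "Fix crash on empty password (#12)",
     "files": ["ui/login.py", "auth/check.py"]},
    {"sha": "c3", "message": "Update README", "files": ["README.md"]},
    {"sha": "d4", "message": "Bugfix: wrong token expiry",
     "files": ["auth/check.py"]},
]

# Assumed heuristic: a commit is treated as a bug fix if its message
# contains one of these keywords. The thesis may use a different rule.
FIX_KEYWORDS = ("fix", "bug")


def store_commits(conn, commits):
    """Save commits and their touched files in two relational tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS commits (sha TEXT PRIMARY KEY, message TEXT)"
    )
    conn.execute("CREATE TABLE IF NOT EXISTS commit_files (sha TEXT, path TEXT)")
    for commit in commits:
        conn.execute(
            "INSERT INTO commits VALUES (?, ?)",
            (commit["sha"], commit["message"]),
        )
        conn.executemany(
            "INSERT INTO commit_files VALUES (?, ?)",
            [(commit["sha"], path) for path in commit["files"]],
        )
    conn.commit()


def fix_counts_per_file(conn):
    """Return {file path: number of distinct fix commits touching it}."""
    where = " OR ".join("lower(c.message) LIKE ?" for _ in FIX_KEYWORDS)
    params = [f"%{keyword}%" for keyword in FIX_KEYWORDS]
    rows = conn.execute(
        "SELECT f.path, COUNT(DISTINCT c.sha) "
        "FROM commits c JOIN commit_files f ON f.sha = c.sha "
        f"WHERE {where} GROUP BY f.path ORDER BY f.path",
        params,
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory database for the demo
    store_commits(connection, SAMPLE_COMMITS)
    for path, count in fix_counts_per_file(connection).items():
        print(path, count)
```

A per-file count of this kind is one plausible input feature for a bug-location dataset; the datasets actually used in the thesis are generated from Neo4j and may be structured differently.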

Supervisors: Maurizio Morisio
Academic year: 2019/20
Publication type: Electronic
Number of Pages: 87
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: EIS SRL
URI: http://webthesis.biblio.polito.it/id/eprint/18665