Tracing methodologies and tools for Artificial Intelligence and Data Mining Java applications

Roberto Stagi

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree course in Computer Engineering (Ingegneria Informatica), 2020

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Supercomputing and Artificial Intelligence are among the most important developments of recent decades. Both have been behind many recent discoveries and, like most applications in general, have been moving from a sequential paradigm to parallel and distributed approaches that better fit new hardware. The High Performance Computing (HPC) discipline is at the heart of these developments. In this context, the Java programming language plays a marginal role. However, Java is still in high demand, is employed in AI, and runs effectively on supercomputers. Even though fewer programmers use it for HPC applications, its influence in the AI world is not negligible, and the tools that support its development in such environments deserve more attention. Parallel program performance analysis is concerned with achieving efficient utilisation of system resources. One common technique is to collect trace data and then analyse it for possible causes of poor performance. The Performance Tools department of the Barcelona Supercomputing Center (BSC) is in charge of developing this kind of tool. The thesis was developed during an internship in this department, and for this reason the work is based on the two main tools developed there: Extrae and Paraver. The former extracts the trace information, while the latter displays it. The main focus of this thesis is on Extrae. Extrae's current instrumentation for Java is limited. Apart from a few features that trace basic thread events, using the instrumentation of pthreads (onto which all Java threads are mapped), it does not provide much valuable information. A study of the state of the art is covered in chapter 2. Since Extrae is implemented in C, generating probes and wrappers would not be an issue for other C-implemented programs. Chapter 3 gives an overview of the approaches that can be used to generate traces for a Java program.
The approach then developed is based on an event-driven platform offered by the JVM, the JVM Tool Interface (JVM TI), combined with AspectJ, the extension of the Java language that implements the aspect-oriented programming paradigm. The development of this platform is described in chapters 4 and 5, and it is then applied to a real Java framework: Hadoop. This study is carried out in chapter 6, which also discusses the work of the thesis as a whole.

Supervisors: Paolo Garza
Academic year: 2019/20
Publication type: Electronic
Number of pages: 105
Subjects:
Degree course: Master's degree course in Computer Engineering (Ingegneria Informatica)
Degree class: New organization > Master's degree > LM-32 - INGEGNERIA INFORMATICA (Computer Engineering)
Joint supervision institution: UNIVERSITAT POLITÈCNICA DE CATALUNYA - FIB (SPAIN)
Collaborating companies: BARCELONA SUPERCOMPUTING CENTER
URI: http://webthesis.biblio.polito.it/id/eprint/15276