Extraction, indexing and analysis of Ethereum smart contracts data

Davide Aimar

Supervisors: Valentina Gatteschi, Mariusz Nowostawski. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Blockchain technology has gained popularity in the last decade. New protocols allow developers to build decentralized applications using smart contracts. Ethereum is one of the most popular blockchain networks of this kind. Every twelve seconds, a new block is appended to its chain. Each block contains information that describes a market worth billions of dollars. Since Ethereum is a permissionless blockchain, this data is publicly available to anyone, but without proper tools it is not easy to analyze. This master’s thesis focuses on extracting semantics from raw Ethereum data and making it easily available to users by indexing it with Dgraph, an open-source distributed graph database. A review of the state-of-the-art tools showed that the relevant work in this field has been done by private companies whose source code and methodology are not available. Many open-source and public projects turned out to be outdated or slow. This poses the risk of centralizing access to blockchain data in the hands of a few companies. Part of this master’s thesis was dedicated to analyzing the semantics that can be extracted from the blockchain and building a data schema around it that is optimized for graph databases. A custom tool, called eth2dgraph, was developed to perform the data extraction. It is an open-source tool written in Rust that maps Ethereum data to the Dgraph format. It integrates a decompiler to extract and index the ABI of smart contracts. Eth2dgraph was developed with a focus on performance, so that the extraction process could scale to the entire history of the Ethereum blockchain. At the end of the thesis, the data indexed in Dgraph was analyzed to show the current state of the Ethereum blockchain. This work provides an alternative solution to the problem of blockchain data analysis. The open-source nature of the project allows other developers to build on top of it.
Performing the actual extraction and indexing came close to hitting the limit of what can be done on a single machine. This highlights the fact that, in the future, distributed approaches will be the only possible way of handling the increasing amount of data that comes from the Ethereum blockchain. This is already evident with layer 2 protocols, which are generating data at a faster pace than Ethereum.

Supervisors: Valentina Gatteschi, Mariusz Nowostawski
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 124
Subjects:
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Joint supervision institution: Norwegian University of Science and Technology (NTNU) (Norway)
Collaborating companies: NTNU
URI: http://webthesis.biblio.polito.it/id/eprint/28450