
From Mineralogy to Petrography: Study on the Applicability of Machine Learning

Francesca Cibrario


Supervisors: Elena Maria Baralis, Andrea Pasini. Politecnico di Torino, Master's degree programme in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2018

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

This thesis investigates whether machine learning techniques can predict the petrographical composition of a soil sample from its mineralogical structure. Formally, the problem requires predicting a set of continuous features (the petrographical constitution) from a set of continuous attributes (the mineralogical constitution). The soil samples at our disposal, for which both the mineralogical and the petrographical composition are known, present two challenges: their limited number and the non-uniform distribution of their feature values. These distributions are Gaussian or exponential. We addressed the prediction task with both regression and classification techniques. For classification, we discretized the petrographical attributes using a custom methodology. This procedure maps the continuous domain of a feature onto a discrete one made up of intervals of values. We used these intervals as class labels, so that a sample is classified by indicating the range in which its value is likely to fall. As regression techniques we applied Linear, Lasso and Ridge regression and Support Vector Regression (SVR). The classification method used is the Decision Tree. Each attribute prediction is treated as a separate problem: all mineralogical attributes are used to predict one petrographical feature. Furthermore, we tested a custom third approach based on Non-Negative Matrix Factorization (NMF), which jointly predicts all petrographical features from all mineralogical attributes. For the purposes of performance evaluation, this custom methodology is treated as a regression. We also designed a way to evaluate the ability of regression models to predict rare values: it consists of two visual metrics, called regression-precision and regression-recall, inspired by the precision and recall metrics used in classification.
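The discretization step described above can be illustrated with a minimal sketch. The abstract does not specify the custom methodology, so the equal-frequency rule used here is only an assumption chosen for illustration, and the function names and sample values are hypothetical.

```python
from bisect import bisect_right


def equal_frequency_edges(values, n_bins):
    """Return n_bins - 1 cut points so that each interval holds roughly
    the same number of samples (an assumed rule, not the thesis method)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def to_interval_label(value, edges):
    """Map a continuous value to the index of the interval it falls in.
    Interval k covers edges[k-1] <= value < edges[k]."""
    return bisect_right(edges, value)


# Hypothetical values of one petrographical feature, skewed towards small values.
feature_values = [0.5, 1.0, 1.5, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0, 55.0, 89.0]

edges = equal_frequency_edges(feature_values, n_bins=3)
labels = [to_interval_label(v, edges) for v in feature_values]

print(edges)   # cut points between the three intervals
print(labels)  # one interval index (class label) per sample
```

With a skewed distribution, equal-frequency intervals keep every class populated, which matters when samples are few; equal-width intervals would leave the upper ranges almost empty.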
To compare regression and classification predictions, we discretized the continuous outputs of the regression techniques using the same intervals adopted for the classification labels. After this operation, we compared the predictions using classification metrics: precision, recall and F-measure. Results show that the models perform better on exponentially distributed petrographical attributes. Linear, Lasso and Ridge regression are more promising than the Decision Tree, while NMF and SVR perform worst. In the future we will focus on exploiting, with the help of a domain expert, relations that partially link the petrographical composition to the mineralogical one, so as to compensate for the lack of data with domain knowledge. Although we have not yet achieved final results, our preliminary analysis reveals that regression techniques can potentially be applied successfully to this kind of soil analysis.
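As a rough sketch of the comparison described above, the snippet below maps true values and regression outputs onto the same intervals and then computes per-interval precision, recall and F-measure. The interval edges, the sample values and the helper names are hypothetical; they are not taken from the thesis.

```python
from bisect import bisect_right


def discretize(values, edges):
    """Map each continuous value to the index of the interval it falls in."""
    return [bisect_right(edges, v) for v in values]


def per_class_scores(true_labels, predicted_labels, n_classes):
    """Return {class: (precision, recall, f_measure)} for each interval."""
    pairs = list(zip(true_labels, predicted_labels))
    scores = {}
    for c in range(n_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f_measure)
    return scores


# Hypothetical interval edges shared by the classifier and the regressors.
edges = [10.0, 30.0]

# Hypothetical ground truth and regression outputs for one feature.
y_true = [2.0, 8.0, 12.0, 25.0, 40.0, 55.0]
y_regression = [4.0, 11.0, 15.0, 28.0, 29.0, 60.0]

true_labels = discretize(y_true, edges)
regression_labels = discretize(y_regression, edges)
scores = per_class_scores(true_labels, regression_labels, n_classes=3)

for interval, (p, r, f) in scores.items():
    print(f"interval {interval}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Once the regression outputs are expressed as interval labels, they can be scored with exactly the same function as the labels produced by a classifier, which is what makes the two families of models comparable.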

Supervisors: Elena Maria Baralis, Andrea Pasini
Academic year: 2018/19
Publication type: Electronic
Number of pages: 71
Subjects:
Degree programme: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Degree class: New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA (Computer Engineering)
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/9544