convolutional neural networks for statistical post-processing of wind gusts speed

Francesco Guardamagna

convolutional neural networks for statistical post-processing of wind gusts speed.

Supervisors: Roberto Fontana, Elisa Perrone. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2022

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Convolutional networks for statistical post-processing of wind gust predictions. A wind gust can be defined as a brief increase in wind speed above its average, usually lasting less than 20 seconds. These sudden bursts are critical because they can cause various types of damage. Our research project focuses on estimating the conditional probability P(Y | X), where X are the predictions of different weather variables and Y are the observed wind gust speeds. To take advantage of the spatial patterns in our input data (we work with gridded predictions from the Harmonie model), we use techniques based on convolutional architectures. Deep learning methods also allow us to estimate more complex relations involving more than one input variable.

We adopt three models:
- Binary classification approach.
- Quantized Softmax.
- Bernstein polynomials.

The first approach converts our problem into a binary classification problem by assigning samples with a wind gust speed above or below a given threshold to two different classes, while the other two methods estimate a continuous conditional distribution.

We focus mainly on the binary classification approach. Working with it, we identify two main problems:
- How to reduce the number of features considered in our feature selection approach.
- How to deal with a training dataset characterized by a strongly imbalanced class distribution.

To check whether our input data contain redundant information, we compute the pairwise Pearson correlation coefficient between each pair of input features. This preliminary analysis lets us avoid considering two highly correlated features at the same time during our cross-validation procedure. The threshold we adopt to identify an event as extreme is a wind gust speed of 16 m/s.
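The correlation screening and the 16 m/s labelling described above can be sketched as follows. This is a minimal illustration, not the thesis code: the feature names, the synthetic data, the 0.9 correlation cutoff and the strict "greater than" comparison at the threshold are all assumptions, and the features are treated as flat per-sample values rather than gridded fields.

```python
import numpy as np

GUST_THRESHOLD_MS = 16.0  # extreme-event threshold stated in the abstract (m/s)
CORR_CUTOFF = 0.9         # assumed cutoff; the abstract does not state one


def label_extreme(gust_speed):
    """Return 1 for gusts above the threshold, 0 otherwise (strict '>' is an assumption)."""
    return (np.asarray(gust_speed) > GUST_THRESHOLD_MS).astype(int)


def highly_correlated_pairs(features, names, cutoff=CORR_CUTOFF):
    """List feature pairs whose absolute Pearson correlation exceeds the cutoff.

    features: array of shape (n_samples, n_features).
    """
    corr = np.corrcoef(features, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > cutoff:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Synthetic stand-in data with hypothetical feature names.
rng = np.random.default_rng(0)
wind = rng.normal(8.0, 3.0, size=500)
features = np.column_stack([
    wind,
    wind * 1.5 + rng.normal(0.0, 0.1, size=500),  # nearly a rescaled copy of wind
    rng.normal(1010.0, 5.0, size=500),            # unrelated variable
])
names = ["wind_speed", "wind_speed_scaled", "pressure"]

pairs = highly_correlated_pairs(features, names)
labels = label_extreme([5.0, 16.0, 21.3])
```

Pairs returned by `highly_correlated_pairs` are candidates that a feature selection loop would avoid including together.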
Applying this threshold to our training data to define the two classes yields a training dataset with a strongly imbalanced class distribution, which largely affects model performance. To mitigate the class imbalance, we apply two different downsampling techniques, and we also assign different weights to the loss contributions of samples from the two classes: a larger weight for samples of the minority class and a smaller weight for samples of the majority class. After the feature selection and cross-validation procedure, which has been applied only to the binary classification approach, the model has been retrained on the entire training dataset using the best set of predictors and hyperparameters, and its performance has then been evaluated on the entire test dataset. Different experiments have been performed using both the downsampling techniques and the weighting method, in order to compare the results. The techniques we experimented with mitigated the class imbalance that characterizes the training dataset and gave better performance. We also want to experiment with the Quantized Softmax and Bernstein polynomials methods, using the best set of predictors and hyperparameters identified during the feature selection and cross-validation procedure for the binary classification approach. Finally, we want to compare the obtained results.
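The two imbalance remedies mentioned above can be illustrated with a short sketch. It is written under stated assumptions rather than taken from the thesis: the inverse-frequency weighting formula, the random downsampling scheme (the abstract does not say which two downsampling techniques were used), the function names and the toy 95/5 class split are all hypothetical, and class 1 is assumed to be the minority.

```python
import numpy as np


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (2 * class_count) for classes 0 and 1."""
    labels = np.asarray(labels)
    counts = np.bincount(labels, minlength=2)
    return len(labels) / (2.0 * counts)


def weighted_bce(labels, probs, weights, eps=1e-7):
    """Mean binary cross-entropy with a per-class weight applied to each sample."""
    labels = np.asarray(labels)
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    per_sample = -(labels * np.log(probs) + (1 - labels) * np.log(1.0 - probs))
    return float(np.mean(weights[labels] * per_sample))


def random_downsample(labels, rng):
    """Indices keeping every class-1 sample and an equal number of random class-0 samples."""
    labels = np.asarray(labels)
    minority = np.flatnonzero(labels == 1)
    majority = np.flatnonzero(labels == 0)
    kept = rng.choice(majority, size=len(minority), replace=False)
    return np.sort(np.concatenate([minority, kept]))


# Toy data: 95 non-extreme samples, 5 extreme ones.
labels = np.array([0] * 95 + [1] * 5)
weights = class_weights(labels)

# A model that always predicts a 5% chance of an extreme event.
probs = np.full(100, 0.05)
loss_plain = weighted_bce(labels, probs, np.ones(2))
loss_weighted = weighted_bce(labels, probs, weights)

balanced_idx = random_downsample(labels, np.random.default_rng(0))
```

With the weights applied, errors on the rare extreme events dominate the loss, so a model that mostly ignores them is penalized more heavily than under the unweighted loss.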

Supervisors: Roberto Fontana, Elisa Perrone
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 57
Degree programme: Master's degree programme in Data Science and Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Joint supervision institution: Eindhoven University of Technology (TU/e) (Netherlands)
Collaborating companies: Eindhoven University of Technology TU/e
URI: http://webthesis.biblio.polito.it/id/eprint/22745