Learning Generalized Linear Models with Superstatistical Covariates

Leonardo Defilippis

Learning Generalized Linear Models with Superstatistical Covariates.

Supervisors: Alfredo Braunstein, Bruno Loureiro. Politecnico di Torino, Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

While machine learning has made significant advances in recent years, its theoretical understanding remains an open challenge. One key aspect is the ability to predict how well the predictions of learning algorithms generalize, which is crucial for assessing their reliability in domains such as medicine, biology, finance, and signal processing. Previous studies of supervised learning have, in the vast majority of cases, considered Gaussian-distributed covariates. However, in practical applications of machine learning, the data distribution may deviate from Gaussianity in many ways, for example through fluctuations, heavy tails, or structured patterns. This work aims to investigate, using the heuristic replica method from statistical physics, the supervised learning of generalized linear models when the covariates are distributed according to a superstatistical model, meaning that each covariate is drawn from a Gaussian distribution whose random covariance follows a generic probability distribution ρ. The regime of interest is that of finite sample complexity, defined as the ratio of the sample size to the covariates' dimension, with both taken to be infinitely large. The choice of ρ can drastically affect the resulting distribution of the covariates, which may exhibit heavy tails or even infinite variance. In particular, we derive equations that predict the minimal estimation error achievable by any algorithm given the data, by studying the Bayes-optimal setting for this problem. We compare these results with those of empirical risk minimization. We then compute the leading order of the estimation error curves with respect to the sample complexity, showing that it does not depend on the choice of ρ and is compatible with the Gaussian covariates case.
Our findings align with the Gaussian universality principle, which has been proven rigorously for several problems and states that non-Gaussian data can be effectively described by Gaussian distributions with matching first two moments.
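
To make the data model in the abstract concrete, the following is a minimal NumPy sketch of how superstatistical covariates could be sampled. It is an illustration under stated assumptions, not the thesis's own code: the covariance is taken to be isotropic (a single random variance per sample), ρ is taken to be an inverse-gamma distribution, and the sign-based labels are a hypothetical example of a generalized linear model. The function names `sample_superstatistical` and `inverse_gamma` are invented for this sketch.

```python
import numpy as np


def sample_superstatistical(n, d, sample_variance, rng):
    """Draw n covariates in R^d with x_i ~ N(0, s_i * I_d).

    Assumption: the random covariance is isotropic, so each sample i has a
    single random variance s_i drawn from rho via `sample_variance`.
    """
    s = sample_variance(n, rng)        # one random variance per sample, shape (n,)
    z = rng.standard_normal((n, d))    # standard Gaussian vectors
    return np.sqrt(s)[:, None] * z, s


def inverse_gamma(shape, scale):
    """Sampler for inverse-gamma variances (a hypothetical choice of rho)."""
    def sampler(n, rng):
        # If g ~ Gamma(shape, 1), then scale / g ~ InverseGamma(shape, scale).
        return scale / rng.gamma(shape, 1.0, size=n)
    return sampler


rng = np.random.default_rng(0)
n, d = 2000, 500
alpha = n / d  # sample complexity: number of samples over covariate dimension

# With shape = 1.5 each entry of x is Student-t distributed with 3 degrees of
# freedom (heavy tails, finite variance); shape <= 1 gives infinite variance.
X, s = sample_superstatistical(n, d, inverse_gamma(shape=1.5, scale=0.5), rng)

# Hypothetical generalized linear model: labels from a random "teacher" vector.
w_star = rng.standard_normal(d)
y = np.sign(X @ w_star / np.sqrt(d))
```

Conditionally on its variance each sample is Gaussian, so all the non-Gaussian behaviour comes from the choice of ρ. Replacing `inverse_gamma` with a sampler that returns a constant recovers the standard Gaussian covariates case.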

Supervisors: Alfredo Braunstein, Bruno Loureiro
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 42
Degree programme: Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi)
Degree class: New organization > Master science > LM-44 - MATHEMATICAL MODELLING FOR ENGINEERING
Joint supervision institution: École Normale Supérieure (France)
Collaborating companies: École Normale Supérieure
URI: http://webthesis.biblio.polito.it/id/eprint/27743