
Tina Mohammadlavasani
From Data to Decisions: Enhancing Monthly Profit Predictions through XGBoost.
Supervisors: Alessandro Savino, Roberta Bardini. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025
Abstract: |
This thesis addresses the need for accurate financial forecasting at multinational electronics manufacturers such as Carlo Gavazzi Automation. Traditional forecasting methods struggle with complex business data that exhibit seasonality, fluctuations, and missing values. The primary goal is to develop a robust, time-aware model that forecasts monthly gross profit with XGBoost and outperforms simpler baselines. The methodology begins with systematic data preparation of Carlo Gavazzi Automation's granular sales records (2016-2024). Cleaning involved handling missing values and standardizing suspicious entries. Data types were converted: 'Fiscal Period' to datetime64[ns] and numerical columns to float64. For feature engineering, the granular data were aggregated to monthly totals per 'product_group', turning them into a time series; time-based features, lag features, and a 3-month rolling mean of gross_profit were then computed. To manage returns and outliers, gross_profit was filtered to values that are positive and below a threshold of 300,000. 'product_group' was handled using XGBoost's native categorical support. The XGBoost Regressor was selected for its high performance, flexibility, and efficiency. The experimental setup used a strict chronological 80/20 train-test split. Hyperparameters were tuned using GridSearchCV with TimeSeriesSplit (n_splits=5), which preserves temporal order. Performance was assessed using Mean Absolute Error (MAE), R-squared (R^2), Symmetric Mean Absolute Percentage Error (SMAPE), and the ratio of MAE to average profit. The results showed a clear progression: a base XGBoost model outperformed the simple baselines. Data cleaning improved MAE but came with a trade-off. Frequency encoding of product_group yielded no further gains.
The largest improvement came from hyperparameter tuning: the optimized model achieved notably lower MAE and SMAPE than the earlier variants. In conclusion, the fine-tuned XGBoost model offers a robust, time-aware predictive capability for monthly gross profit and a substantial advance for Carlo Gavazzi Automation's planning needs. Future work includes exploring advanced models (e.g., LSTM, Transformer), integrating external data, forecasting at a more granular level, and deploying the model. |
---|---|
Supervisors: | Alessandro Savino, Roberta Bardini |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 62 |
Additional information: | Confidential thesis. Full text not available |
Subjects: | |
Degree programme: | Master's degree programme in Data Science And Engineering |
Degree class: | New system > Master's degree > LM-32 - Computer Engineering |
Collaborating companies: | GRUPPO IVA CARLO GAVAZZI ITALIA |
URI: | http://webthesis.biblio.polito.it/id/eprint/36354 |
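
The time-aware steps named in the abstract (calendar features, lag features, a 3-month rolling mean, a chronological 80/20 split, SMAPE, and the ratio of MAE to average profit) can be illustrated with a minimal sketch. The full text is not available, so everything below is an assumption rather than the thesis's implementation: the function names, the lowercase column names `fiscal_period`, `product_group` and `gross_profit`, the choice of two lags, shifting the rolling mean by one month to keep the current target out of its own feature, splitting by calendar month, and the particular SMAPE variant. The XGBoost fit and the GridSearchCV tuning are left out, and the toy data are synthetic.

```python
import numpy as np
import pandas as pd


def add_time_features(monthly: pd.DataFrame) -> pd.DataFrame:
    """Add calendar, lag and rolling-mean features per product group.

    Expects columns fiscal_period (datetime), product_group, gross_profit.
    """
    df = monthly.sort_values(["product_group", "fiscal_period"]).copy()
    df["month"] = df["fiscal_period"].dt.month
    df["year"] = df["fiscal_period"].dt.year
    grouped = df.groupby("product_group")["gross_profit"]
    # Lags only look at earlier months of the same product group.
    df["lag_1"] = grouped.shift(1)
    df["lag_2"] = grouped.shift(2)
    # Assumption: the rolling mean covers the three preceding months, so the
    # current month's target never enters its own feature.
    df["rolling_mean_3"] = grouped.transform(
        lambda s: s.shift(1).rolling(window=3).mean()
    )
    # Rows without a full history are dropped.
    return df.dropna().reset_index(drop=True)


def chronological_split(df: pd.DataFrame, train_fraction: float = 0.8):
    """Split by calendar month so every test row is later than every train row."""
    periods = df["fiscal_period"].drop_duplicates().sort_values().reset_index(drop=True)
    cutoff = periods.iloc[int(len(periods) * train_fraction)]
    return df[df["fiscal_period"] < cutoff], df[df["fiscal_period"] >= cutoff]


def smape(y_true, y_pred) -> float:
    """Symmetric MAPE in percent (one common variant; 0/0 terms count as 0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    ratio = np.divide(
        np.abs(y_pred - y_true), denom, out=np.zeros_like(denom), where=denom != 0
    )
    return float(100.0 * ratio.mean())


def mae_over_mean(y_true, y_pred) -> float:
    """MAE divided by the average of the true values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)) / np.mean(y_true))


# Synthetic toy data: two hypothetical product groups over 24 months.
periods = pd.date_range("2022-01-01", periods=24, freq="MS")
toy = pd.DataFrame(
    {
        "fiscal_period": list(periods) * 2,
        "product_group": ["A"] * 24 + ["B"] * 24,
        "gross_profit": np.concatenate(
            [np.linspace(1000, 3300, 24), np.linspace(500, 2800, 24)]
        ),
    }
)

features = add_time_features(toy)
train, test = chronological_split(features)

# Naive baseline for comparison: predict last month's gross profit.
baseline_smape = smape(test["gross_profit"], test["lag_1"])
baseline_ratio = mae_over_mean(test["gross_profit"], test["lag_1"])
print(f"baseline SMAPE: {baseline_smape:.2f}%, MAE / mean: {baseline_ratio:.3f}")
```

Splitting on a month boundary rather than on a row index keeps all product groups of a given month on the same side of the split, which is one way to honor the "strict chronological" requirement once the data hold several series.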