
Feature importance analysis for User Lifetime Value prediction in games using Machine Learning: an exploratory approach

Gilberto Vilar De Carvalho Santos


Supervisors: Barbara Caputo, Giuseppe Rizzo. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2020

License: Creative Commons Attribution Non-commercial No Derivatives.


The main characteristic of a freemium business model is that a small share of users generates most of the company's revenue, financing the product for the remaining users. In the growing gaming industry this scenario becomes even more critical, since anyone who has a cellphone and an internet connection can access thousands of Free-to-Play games. Firms therefore need to perform two difficult tasks in order to start or sustain revenue growth: first, find and attract potentially high-value users; second, retain and up-sell those users. Customer Lifetime Value (LTV) is the most widely used metric for identifying high-value users and allocating marketing budget in business decision-making. Its prediction has become a crucial part of game companies' daily work, since players generate an exceptionally rich dataset that can be used to understand and predict their purchasing behavior over time. In this work, a complete exploratory analysis of a real dataset from a large Brazilian game company is performed. The main features to be analyzed were carefully selected by the author and domain experts from the company. The result of this analysis is a set of actionable insights about each selected feature, a complete preprocessed dataset ready to be used with a set of Machine Learning algorithms, and a feature importance analysis based on a simple Random Forest Regression. The study shows that static features such as country, platform and source have low correlation with long-term LTV but provided interesting insights to the company. The most important features for LTV prediction are derived from purchase-related ones (number of purchases and net revenue per purchase), which is in line with the literature. Event-related features are complex and showed low correlation with the target variable, but they still contain other potential features that could be analyzed.
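The feature importance step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the company dataset is not public, so the data below is synthetic, and the column names (`num_purchases`, `net_revenue_per_purchase`, `country_code`), the distributions, and the way the target is built are all assumptions made for illustration. It fits scikit-learn's `RandomForestRegressor` and reads its impurity-based `feature_importances_`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: every column name and distribution here is a
# hypothetical illustration, not taken from the thesis dataset.
rng = np.random.default_rng(0)
n_users = 1000
num_purchases = rng.poisson(2.0, n_users)
net_revenue_per_purchase = rng.gamma(2.0, 5.0, n_users)
country_code = rng.integers(0, 10, n_users)  # static feature, unrelated to the target

# Assumed target: LTV driven by purchase behaviour plus a little noise.
ltv = num_purchases * net_revenue_per_purchase + rng.normal(0.0, 1.0, n_users)

feature_names = ["num_purchases", "net_revenue_per_purchase", "country_code"]
X = np.column_stack([num_purchases, net_revenue_per_purchase, country_code])

# A simple forest with a fixed seed so the run is reproducible.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, ltv)

# Impurity-based importances; scikit-learn normalises them to sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Because the synthetic target is built only from the two purchase-related columns, the static column should receive the smallest score. Impurity-based importances can favour high-cardinality features, so on real data a permutation-based check is a reasonable complement.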

Supervisors: Barbara Caputo, Giuseppe Rizzo
Academic year: 2020/21
Publication type: Electronic
Number of Pages: 93
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New organization > Master science > LM-25 - AUTOMATION ENGINEERING
Collaborating companies: FONDAZIONE LINKS
URI: http://webthesis.biblio.polito.it/id/eprint/16742