|
APPLICATION OF MACHINE LEARNING FOR CHURN PREDICTION BASED ON TRANSACTIONAL DATA (RFM ANALYSIS)
|
|
|
Y. Aleksandrova
|
|
|
||
|
|
|
|
1314-2704
|
|
|
||
|
English
|
|
|
18
|
|
|
2.1
|
|
|
|
|
|
||
|
Machine learning covers a wide set of supervised and unsupervised algorithms for solving prediction, classification and anomaly detection problems. One area of application is customer churn prediction. To build a model that predicts customer switching, data scientists use a variety of demographic, social, transactional and behavioural metrics and features. At the same time, most small Bulgarian companies still don't have the versatile and complete customer data this requires. They rely mainly on information provided by their ERP system, which generates mostly transaction-oriented data. Small and medium-sized enterprises at this stage are not planning major investments in marketing research or additional customer-related data sources, and are limited to modelling and forecasting on transactional data.
The main goal of the current study is to propose a combination of RFM analysis and machine learning algorithms for churn prediction based mainly on transactional data. The dataset is extracted from the ERP system of a regional concrete production company in Bulgaria. RFM scores are calculated for every customer for a period of 6 months before the end date of examination. The target value for the prediction models is a churn metric indicating whether or not the customer made a transaction in the 6 months following the RFM analysis. Several machine learning algorithms have been applied: Two-Class Boosted Decision Trees, Two-Class Neural Networks, Two-Class Decision Jungle, Two-Class SVM and Two-Class Logistic Regression. The experiments were performed in Azure Machine Learning Studio. The results showed that, despite the limitations of RFM scores and metrics, companies can use machine learning algorithms to predict customer churn with enough confidence. The best models for churn prediction proved to be Two-Class Decision Jungle, Two-Class Boosted Decision Trees and Two-Class Neural Networks. There are no notable differences when using the recency, frequency and monetary values instead of their scores (R, F, M and RFM). |
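The feature and label construction described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation (the study used Azure Machine Learning Studio). The function name `rfm_with_churn_label`, the 182-day approximation of "6 months", the convention that `churned` is true when no transaction follows, and the sample transactions are all assumptions made for illustration. Conversion of the raw values into R, F and M scores is not shown, since the abstract does not state the scoring scheme.

```python
from collections import defaultdict
from datetime import date, timedelta


def rfm_with_churn_label(transactions, split_date, window_days=182):
    """Compute raw recency/frequency/monetary values and a churn flag.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    split_date:   end of the observation window, start of the label window.
    window_days:  assumed length of each window (roughly 6 months).
    """
    obs_start = split_date - timedelta(days=window_days)
    label_end = split_date + timedelta(days=window_days)

    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    active_later = set()

    for customer, day, amount in transactions:
        if obs_start <= day < split_date:
            # Observation window: accumulate the RFM inputs.
            frequency[customer] += 1
            monetary[customer] += amount
            if customer not in last_purchase or day > last_purchase[customer]:
                last_purchase[customer] = day
        elif split_date <= day < label_end:
            # Label window: any purchase here means the customer stayed.
            active_later.add(customer)

    # Only customers seen in the observation window get a feature row.
    return {
        customer: {
            "recency_days": (split_date - last_purchase[customer]).days,
            "frequency": frequency[customer],
            "monetary": monetary[customer],
            "churned": customer not in active_later,
        }
        for customer in frequency
    }


# Hypothetical sample data, invented for this example only.
transactions = [
    ("A", date(2017, 2, 1), 100.0),
    ("A", date(2017, 5, 20), 250.0),
    ("A", date(2017, 8, 3), 80.0),   # falls in the label window
    ("B", date(2017, 3, 15), 400.0),
]
rows = rfm_with_churn_label(transactions, split_date=date(2017, 7, 1))
```

Each resulting row could then serve as one training example for a binary classifier, with `churned` as the target and either the raw values or scores derived from them as features.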
|
|
conference
|
|
|
||
|
||
|
18th International Multidisciplinary Scientific GeoConference SGEM 2018
|
|
|
18th International Multidisciplinary Scientific GeoConference SGEM 2018, 02-08 July, 2018
|
|
|
Proceedings Paper
|
|
|
STEF92 Technology
|
|
|
International Multidisciplinary Scientific GeoConference-SGEM
|
|
|
Bulgarian Acad Sci; Acad Sci Czech Republ; Latvian Acad Sci; Polish Acad Sci; Russian Acad Sci; Serbian Acad Sci & Arts; Slovak Acad Sci; Natl Acad Sci Ukraine; Natl Acad Sci Armenia; Sci Council Japan; World Acad Sci; European Acad Sci, Arts & Letters; Ac
|
|
|
125-132
|
|
|
02-08 July, 2018
|
|
|
website
|
|
|
cdrom
|
|
|
510
|
|
|
machine learning; RFM analysis; churn prediction; classification and prediction; data mining
|
|