Customer Churn Prediction Using Machine Learning

  • D. Deepika, Nihal Chandra


Customers are the most important asset in any industry because they are the main source of profit. To survive in today’s competitive market, companies rely on multiple strategies. Three main strategies have been proposed to generate more revenue: (1) acquiring new customers, (2) upselling to existing customers, and (3) increasing the retention period of customers. Comparison of these strategies has shown that retaining an existing customer costs much less than acquiring a new one and is considered much easier than upselling. To apply the third strategy, companies have to reduce the likelihood of customer churn. Customer churn refers to the loss of a client or customer, that is, a customer ceasing to interact with a company or business. Similarly, the churn rate is the rate at which customers or clients leave a company within a specific period of time. Customer churn is one of the most important concerns for large companies. Because churn directly affects revenue, especially in the telecom field, companies are seeking ways to predict which customers are likely to churn. Identifying the factors that increase customer churn is therefore important for taking the actions needed to reduce it.
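The churn rate described above can be illustrated with a short calculation. This is a minimal sketch that assumes the common convention of dividing the customers lost in a period by the customers present at its start. The function name and the figures are hypothetical and are not taken from the study's dataset.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Fraction of customers lost during a period (assumed convention:
    customers lost divided by customers at the start of the period)."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Hypothetical figures for illustration only.
monthly_rate = churn_rate(customers_at_start=7000, customers_lost=350)
print(f"Monthly churn rate: {monthly_rate:.1%}")
```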

This project develops a churn prediction model that can help companies predict which customers are most likely to churn. It uses machine learning techniques such as Logistic Regression, Decision Trees, K-Nearest Neighbors, and Support Vector Machine algorithms to identify the primary determinants of customer churn, along with the algorithm best suited to such predictions. The dataset contains demographic details of customers, their total charges, and the type of service they receive from the company. It comprises churn data for over 7000 customers spread over 21 attributes and was obtained from Kaggle. The use of the above-mentioned algorithms for predicting customer churn is then described.
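A comparison of the four classifiers named above might be set up as in the following sketch, which assumes scikit-learn is available. It is not the study's actual pipeline. A synthetic, imbalanced dataset stands in for the Kaggle data, and the feature scaling, the hyperparameters, the train/test split, and the `compare_models` helper are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit four classifiers on one split and return their test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Hyperparameters are illustrative assumptions, not the study's settings.
    models = {
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=seed),
        "K-Nearest Neighbors": make_pipeline(
            StandardScaler(), KNeighborsClassifier(n_neighbors=5)
        ),
        "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in for the churn data (assumption): about 27% positive class.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4,
    weights=[0.73, 0.27], random_state=0,
)
scores = compare_models(X, y)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

With the real dataset, categorical attributes such as contract type would first need to be encoded numerically, for example with one-hot encoding, before being passed to these estimators.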

To conclude, the results of the algorithms for predicting customer churn are reported in terms of accuracy, recall, precision, F1 score, and kappa, using interactive graphs. The results show that month-to-month contracts and customer tenure are the most crucial attributes in predicting customer churn, and that Logistic Regression achieved the best accuracy, 80.2%.
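The evaluation metrics listed above can all be derived from the four cells of a binary confusion matrix. The sketch below shows one way to compute them in plain Python, assuming churn is encoded as 1 and non-churn as 0 and that kappa refers to Cohen's kappa. The `binary_metrics` helper and the example labels are hypothetical and do not reproduce the study's results.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # stayers recognized
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    n = tp + tn + fp + fn
    if n == 0:
        raise ValueError("no samples")

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical labels for illustration: 3 churners among 10 customers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because churn datasets are usually imbalanced, accuracy alone can look favorable even when many churners are missed, which is one reason to report recall, precision, F1 score, and kappa alongside it.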