Enhancing Target Accuracy with Machine Learning
Using several classification models to predict which customers are likely to sign up for a Delivery Club.
In this project, I built a model that predicts how likely each customer is to sign up for a grocery chain’s delivery club initiative. The end goal is to give the grocery chain a list of the customers most likely to respond to the mail advertisements it will send to promote the delivery club. This should reduce marketing costs, because mailings can be targeted at the customers where they are expected to have the most impact.
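As a rough illustration of that last step, the sketch below turns predicted signup probabilities into a ranked mailing list. The customer IDs, the probabilities, the 0.5 threshold and the `build_mailing_list` helper are all hypothetical; in practice the probabilities would come from the fitted classifier, and the cutoff would depend on the mailing budget.

```python
def build_mailing_list(signup_probabilities, threshold=0.5, max_size=None):
    """Return customer IDs at or above the threshold, most likely first.

    signup_probabilities: dict mapping customer ID -> predicted probability.
    threshold and max_size are illustrative knobs, not values from the project.
    """
    # Rank all customers by predicted probability, highest first.
    ranked = sorted(
        signup_probabilities.items(), key=lambda item: item[1], reverse=True
    )
    # Keep only the customers considered likely enough to be worth a mailing.
    selected = [customer_id for customer_id, prob in ranked if prob >= threshold]
    # Optionally cap the list, e.g. to fit a fixed mailing budget.
    return selected if max_size is None else selected[:max_size]


# Made-up scores for four made-up customers.
scores = {"C001": 0.91, "C002": 0.12, "C003": 0.67, "C004": 0.48}
mailing_list = build_mailing_list(scores, threshold=0.5)
print(mailing_list)  # ['C001', 'C003']
```

Raising the threshold shrinks the list to the most confident predictions, and lowering it trades extra mailing cost for wider reach.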
The dataset was mostly clean, and missing values were excluded because they made up less than 1% of the data. Four classification models were used to generate predictions: Logistic Regression, K-Nearest Neighbors (KNN), Decision Tree and Random Forest. Of the four, the Random Forest model gave the best F1-score (0.92), with an accuracy of 95.7%. Feature importance and permutation importance plots show that “distance from store” is the primary driver of delivery club signups: the farther customers live from the store, the more likely they are to sign up for a delivery club that saves them the trip.
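The comparison described above can be sketched roughly as follows with scikit-learn. This is not the notebook's code: the data is synthetic, the column names are hypothetical stand-ins, and the hyperparameters are arbitrary choices for illustration, so the scores it prints will not match the figures reported here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the project's dataset). Signup probability
# rises with distance from the store; the other two columns are pure noise.
rng = np.random.default_rng(42)
n = 1000
feature_names = ["distance_from_store", "total_sales", "transaction_count"]
X = np.column_stack([
    rng.uniform(0, 5, n),       # hypothetical distance from store
    rng.normal(1000, 300, n),   # hypothetical total sales
    rng.integers(1, 50, n),     # hypothetical transaction count
])
signup_prob = 1 / (1 + np.exp(-3 * (X[:, 0] - 2.5)))
y = (rng.uniform(size=n) < signup_prob).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale features for the distance- and gradient-based models only;
# the tree-based models do not need scaling.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model and score it on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "f1": f1_score(y_test, y_pred),
        "accuracy": accuracy_score(y_test, y_pred),
    }
    print(f"{name}: F1={results[name]['f1']:.3f}, "
          f"accuracy={results[name]['accuracy']:.3f}")

# Permutation importance: how much the test F1-score drops when one
# feature's values are shuffled, averaged over several repeats.
perm = permutation_importance(
    models["Random Forest"], X_test, y_test,
    scoring="f1", n_repeats=10, random_state=42,
)
ranked = sorted(
    zip(feature_names, perm.importances_mean),
    key=lambda pair: pair[1], reverse=True,
)
for feature, importance in ranked:
    print(f"{feature}: {importance:.3f}")
```

Reporting F1 alongside accuracy is useful when the two classes are unevenly represented, since accuracy alone can look high for a model that rarely predicts the minority class. Permutation importance is computed on the held-out data, so it reflects what the fitted model relies on when predicting unseen customers.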
In the Full Analysis, I walk through the cleaning, preprocessing, modelling and model assessment steps for the Logistic Regression and Decision Tree models; the Jupyter Notebook contains the full code for all four models used in this project.