ML application development is really fun, and these apps help data scientists and machine learning engineers solve business problems.
Hello Aliens! In this article, we will go through the complete life cycle of an ML project. The main focus, however, is on the front end, which can be built very easily using Streamlit. Streamlit is one of the coolest platforms available and makes the life of an ML engineer or data scientist much easier. So, let's begin the show.
The problem statement is bank customer churn (attrition). The dataset is from Kaggle and is available here. Churn has become a crucial concern for businesses nowadays, so let me throw some light on what the problem is and why it needs to be kept in check.
Customer attrition is a concern for most businesses operating in low-switching-cost markets, and the banking industry is among the worst affected, with a high churn rate. The capability to predict that a customer is at high risk of churning, while there is still time to prevent it, is a huge potential revenue source for any business. Some studies have found that acquiring a new customer can cost five times more than satisfying and retaining an existing one. Since marketing costs to acquire new customers are high, there are strong incentives to track the churn rate and to retain existing customers, so that the initial investment in them is not wasted.
Basically, this involves two steps:
- Building the ML models and saving the best one.
- Using the saved model to create a web application.
The ML models have already been built, and after hyperparameter tuning the final model is XGBoost, which has pretty impressive metrics. The code for this is available here, so please visit the link to get a complete picture of what was done and how the hyperparameter tuning was carried out.
The metrics for the best model (XGBoost, in my case) are as follows.
Now let’s analyse the classification report.
• True positives (TP): These are cases in which we predicted yes (they churn), and they do churn as well. — (1473)
• True negatives (TN): We predicted no, and they don’t churn. — (1550)
• False positives (FP): We predicted yes (they churn), but they don’t actually churn. — (132)
• False negatives (FN): We predicted no, but they actually churn. — (31)
In scenarios like this, we need to keep a close eye on false negatives: a customer we predict will stay but who actually churns is the costliest mistake, so that figure needs special attention. This model shows far fewer false negatives than the other models, and it also has good precision and recall values.
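As a quick sanity check, the headline metrics can be recomputed by hand from the four counts above (assuming, as in the bullets, that "churn" is the positive class):

```python
# Confusion-matrix counts for the XGBoost model, taken from the bullets above
tp, tn, fp, fn = 1473, 1550, 132, 31

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted churners, how many really churned
recall = tp / (tp + fn)      # of the real churners, how many we caught
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

The high recall here is exactly what we want, since recall is the metric that the false-negative count drags down.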
Now that we have finalised the best model, it needs to be pickled and saved for later use.
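The saving step is a plain pickle dump. Here is a minimal sketch of the pattern; I use a stand-in dictionary because the real artifact would be the fitted XGBoost classifier, but the save-and-reload calls are identical:

```python
import pickle

# Stand-in for the tuned model; with the real fitted classifier the
# dump/load pattern is exactly the same. The file name is illustrative.
model = {"name": "xgboost", "n_estimators": 200}  # hypothetical placeholder

with open("xgb_model.pkl", "wb") as f:
    pickle.dump(model, f)        # save the model to disk

with open("xgb_model.pkl", "rb") as f:
    restored = pickle.load(f)    # load it back later, e.g. inside the app
```

One `.pkl` file is written per model we want to serve, so in our case there would be one for XGBoost and one for Random Forest.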
Now we use Streamlit to create the UI.
First, import the required packages.
Load the pickle files that were created earlier.
Create the variable selection with a few simple lines of code.
Create a function that returns the probability of a customer churning; here we use the two best algorithms in our case (XGBoost and Random Forest).
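The helper can be kept independent of Streamlit, which makes it easy to test. A minimal sketch (the function name and argument shape are my choices, not necessarily the article's exact code):

```python
def churn_probability(model, features):
    """Return the probability that a single customer churns.

    `model` is any fitted classifier exposing predict_proba
    (XGBoost or Random Forest in our case); `features` is one
    row of the inputs collected from the UI.
    """
    proba = model.predict_proba([features])  # shape (1, 2): [stay, churn]
    return float(proba[0][1])                # probability of the churn class
```

Because both models follow the scikit-learn `predict_proba` convention, the same function serves either algorithm.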
Create a button named "Predict". Once the button is clicked after all the inputs have been given, the function is called and the probability is obtained from the chosen algorithm (we have two in our case). If the probability is greater than 0.5, the customer is predicted to churn; otherwise, they are predicted to stay.
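Putting the pieces together, `app.py` looks roughly like this. The widget names, the three example features, and the pickle file names are illustrative; the real app collects every feature the model was trained on:

```python
import pickle

import streamlit as st

# Load the models pickled earlier (file names are illustrative)
xgb_model = pickle.load(open("xgb_model.pkl", "rb"))
rf_model = pickle.load(open("rf_model.pkl", "rb"))

st.title("Bank Customer Churn Prediction")

# Variable selection: which algorithm should make the prediction
algorithm = st.selectbox("Choose the algorithm", ("XGBoost", "Random Forest"))

# A few example inputs (the real app has one widget per model feature)
credit_score = st.number_input("Credit score", min_value=300, max_value=900)
age = st.number_input("Age", min_value=18, max_value=100)
balance = st.number_input("Account balance", min_value=0.0)

def churn_probability(model, features):
    """Probability of churn for a single feature row."""
    return float(model.predict_proba([features])[0][1])

if st.button("Predict"):
    model = xgb_model if algorithm == "XGBoost" else rf_model
    prob = churn_probability(model, [credit_score, age, balance])
    if prob > 0.5:
        st.error(f"Customer is likely to churn (p = {prob:.2f})")
    else:
        st.success(f"Customer is likely to stay (p = {prob:.2f})")
```

Streamlit reruns this script top to bottom on every interaction, so the `if st.button(...)` branch only executes on the run triggered by the click.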
Save this as app.py (or any other name). Then open the terminal and run "streamlit run app.py". You will then see a prompt like this.
Open your browser, and you can see the app, ready to serve the business.
A simple and beautiful web app, created in less than 60 minutes. You can now use Heroku for cloud deployment.
I will be posting deep learning applications and NLP tasks in my upcoming articles.