Customer Churn Prediction With Explainable AI (Live Demo)

📈 A production-style churn prediction system using the Kaggle Telco Customer Churn dataset, built with Python, SHAP explainability, and an interactive Gradio app deployed on Hugging Face.

Why Churn Prediction Matters

Customer churn prediction is one of the most common real-world applications of machine learning. For subscription businesses, telecom providers, or SaaS platforms, being able to predict which customers are likely to leave allows teams to act early and reduce attrition.

Instead of leaving this as a notebook experiment, I built a deployable churn analysis platform with:

Modular Python code (data, training, explainability, UI separated)
Support for multiple models (Logistic Regression, Random Forest, Gradient Boosting, XGBoost, CatBoost)
Business-friendly explainability using SHAP values
A fully interactive demo hosted on Hugging Face Spaces

Live Demo: Churn Prediction App

👉 Try the Churn Prediction App
👉 View Code on GitHub: https://github.com/ajaycyril/telcochurnzeroshot

Dataset: Telco Customer Churn

The application uses the Telco Customer Churn dataset from Kaggle:

7,043 customers
26.5% churn rate
Features: demographics, services, account information
Target variable: whether the customer churned (Yes/No)

System Design & Workflow

The churn prediction app is structured into five clear steps:

Data Ingestion & Preprocessing

Load CSV or auto-download Kaggle dataset
Handle missing values, categorical encoding, and feature engineering (tenure buckets, service counts)
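
The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the code in data.py: the function name `preprocess`, the bucket edges, and the definition of `service_count` are assumptions, while the column names follow the public Kaggle CSV.

```python
import pandas as pd

# Add-on service columns in the Kaggle Telco CSV that hold "Yes"/"No"-style values.
SERVICE_COLS = [
    "OnlineSecurity", "OnlineBackup", "DeviceProtection",
    "TechSupport", "StreamingTV", "StreamingMovies",
]

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Clean the raw Telco frame and add two engineered features (illustrative)."""
    out = df.copy()

    # TotalCharges is stored as text and is blank for a few brand-new customers;
    # coerce to numeric and treat the blanks as zero billed so far.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce").fillna(0.0)

    # Tenure buckets in months (the edges are an assumption, not the repo's values).
    out["tenure_bucket"] = pd.cut(
        out["tenure"],
        bins=[-1, 12, 24, 48, 72],
        labels=["0-12", "13-24", "25-48", "49-72"],
    )

    # Service count: how many add-on services the customer subscribes to.
    out["service_count"] = (out[SERVICE_COLS] == "Yes").sum(axis=1)

    # Binary target, drop the identifier, then one-hot encode categorical columns.
    out["Churn"] = (out["Churn"] == "Yes").astype(int)
    out = out.drop(columns=["customerID"])
    return pd.get_dummies(out, drop_first=True)
```

Filling blank `TotalCharges` with zero is one reasonable choice because those rows have zero tenure; dropping them would work as well.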

Model Training

Choose between Logistic Regression, Random Forest, Gradient Boosting, XGBoost, or CatBoost
Configure cross-validation folds and ensemble size
Automated best-model selection
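
One way to picture the automated selection is a cross-validated comparison that keeps the highest-scoring candidate. The sketch below is an assumption about how that could work, not the logic in train.py: `select_best_model` is a hypothetical helper, ROC-AUC is an assumed selection metric, "ensemble size" is mapped to `n_estimators`, and XGBoost and CatBoost are left out so the example only needs scikit-learn. The data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

def select_best_model(X, y, n_folds=5, n_estimators=100, seed=0):
    """Return (best_name, scores), where scores maps each candidate to its mean CV ROC-AUC."""
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(
            n_estimators=n_estimators, random_state=seed
        ),
        "gradient_boosting": GradientBoostingClassifier(
            n_estimators=n_estimators, random_state=seed
        ),
    }
    # Stratified folds keep the churn rate similar in every split.
    cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    scores = {
        name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
        for name, model in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores

# Synthetic stand-in for the encoded Telco features (roughly 27% positives).
X, y = make_classification(n_samples=500, n_features=10, weights=[0.73], random_state=0)
best_name, scores = select_best_model(X, y, n_folds=3, n_estimators=50)
print(best_name, scores)
```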

Evaluation Metrics

Accuracy, Precision, Recall, F1, ROC-AUC
Confusion Matrix and ROC curve visualizations
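
To show how the threshold-based metrics fall out of the confusion matrix, here is a small dependency-free sketch. The labels and predictions are made-up toy values, and the helper names are hypothetical; in practice `sklearn.metrics` provides the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 meaning 'churned'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels and predictions, for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can mislead on this dataset: with a 26.5% churn rate, a model that predicts "no churn" for everyone is already about 73.5% accurate while catching no churners, which is why precision, recall, and ROC-AUC are reported alongside it.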

Explainability (SHAP)

Global feature importance across the dataset
Individual prediction breakdown (e.g., contract type, tenure, monthly charges, tech support)
SHAP waterfall view for single-customer explanations
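
To make the waterfall view concrete: for a linear model whose features are treated as independent, the SHAP value of a feature has a closed form, namely its coefficient times the difference between the customer's value and the average value. The sketch below computes that by hand in log-odds units. It does not call the shap library or reproduce explain.py, and every coefficient, mean, and feature name in it is a made-up illustration rather than a value fitted to the Telco data.

```python
import math

# Hypothetical logistic-regression weights and training-set feature means.
COEFS = {"tenure": -0.04, "monthly_charges": 0.02, "month_to_month": 0.9, "tech_support": -0.5}
MEANS = {"tenure": 32.0, "monthly_charges": 65.0, "month_to_month": 0.55, "tech_support": 0.29}
INTERCEPT = -1.2

def log_odds(x):
    """Linear model output (log-odds of churn) for one customer."""
    return INTERCEPT + sum(COEFS[f] * x[f] for f in COEFS)

def linear_shap(x):
    """Return (base_value, per-feature contributions) in log-odds units."""
    base_value = log_odds(MEANS)  # model output for the "average" customer
    contributions = {f: COEFS[f] * (x[f] - MEANS[f]) for f in COEFS}
    return base_value, contributions

customer = {"tenure": 2, "monthly_charges": 95.0, "month_to_month": 1, "tech_support": 0}
base, contribs = linear_shap(customer)

# Waterfall order: largest absolute contribution first.
for feature, value in sorted(contribs.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature:>16}: {value:+.3f}")

# Additivity: base value plus contributions reproduces the model output.
assert math.isclose(base + sum(contribs.values()), log_odds(customer))
probability = 1 / (1 + math.exp(-log_odds(customer)))
```

The additivity check at the end is the property that makes a waterfall chart possible: the bars start at the base value and sum exactly to the customer's prediction. Tree models need the shap library's tree explainer instead, since no such closed form applies to them.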

Deployment

Built with Gradio 4.x for an interactive UI
Deployed on Hugging Face Spaces for instant public access
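
A Gradio front end for this kind of app can be quite small. The following is a sketch under stated assumptions, not the contents of app_clean.py: `churn_probability` is a placeholder scorer with hand-picked weights standing in for the trained model, and `build_demo` is a hypothetical helper. The contract choices match the values in the Telco dataset.

```python
import math

def churn_probability(contract: str, tenure: float, monthly_charges: float) -> dict:
    """Placeholder scorer with hand-picked weights; swap in the trained model."""
    z = (
        -1.0
        + (1.2 if contract == "Month-to-month" else 0.0)
        - 0.04 * tenure
        + 0.015 * monthly_charges
    )
    p = 1.0 / (1.0 + math.exp(-z))
    # gr.Label renders a {class name: confidence} mapping as a bar display.
    return {"Churn": p, "Stay": 1.0 - p}

def build_demo():
    """Assemble the UI; gradio is imported here so the scorer works without it."""
    import gradio as gr

    return gr.Interface(
        fn=churn_probability,
        inputs=[
            gr.Dropdown(
                choices=["Month-to-month", "One year", "Two year"],
                value="Month-to-month",
                label="Contract",
            ),
            gr.Slider(minimum=0, maximum=72, value=12, step=1, label="Tenure (months)"),
            gr.Number(value=70.0, label="Monthly charges"),
        ],
        outputs=gr.Label(label="Prediction"),
        title="Churn Prediction (sketch)",
    )
```

Calling `build_demo().launch()` starts the local server. On Hugging Face Spaces the same call goes at the bottom of the Space's entry-point script.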

Code Structure

```
telcochurnzeroshot/
├── data.py        # data ingestion & preprocessing
├── train.py       # model training & evaluation
├── explain.py     # SHAP explainability
├── app_clean.py   # Gradio UI
└── assets/        # saved ROC curves & confusion matrices
```

Example Churn Insights

From the trained models, consistent factors emerge:

Contract Type → month-to-month contracts show the highest churn rates
Tenure → longer-tenure customers are more likely to stay
Monthly Charges → higher monthly fees correlate with higher churn risk
Tech Support → customers with support are less likely to churn

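Each factor maps to an actionable business lever: annual plans for month-to-month customers, loyalty programs for short-tenure customers, pricing adjustments for high monthly charges, and service improvements such as tech support.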

Deployment Options

Hugging Face Spaces → zero setup, public demo
Local run → clone the repository, install dependencies, and start the app:
    git clone https://github.com/ajaycyril/telcochurnzeroshot
    cd telcochurnzeroshot
    pip install -r requirements.txt
    python app_clean.py
Enterprise-ready → containerize with Docker, connect to enterprise data sources, integrate with BI dashboards

Roadmap

Add threshold tuning for precision vs recall tradeoffs
Implement cost-sensitive metrics to tie predictions to their dollar impact
Enable batch/API scoring for production deployment
Add a policy engine mapping churn scores → retention actions
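
As a rough idea of how the first, second, and last roadmap items could fit together, the sketch below picks the decision threshold that minimizes a business cost and then maps scores to actions. None of this exists in the repository yet: the function names, the cost figures, the 0.8 escalation cutoff, and the action labels are all hypothetical, and the scores are toy values.

```python
def expected_cost(y_true, scores, threshold, cost_fn=500.0, cost_fp=50.0):
    """Total cost when every customer scoring at or above `threshold` gets an offer."""
    cost = 0.0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if label == 1 and not flagged:
            cost += cost_fn  # missed churner: lost revenue
        elif label == 0 and flagged:
            cost += cost_fp  # retention offer spent on a customer who would have stayed
    return cost

def best_threshold(y_true, scores, **costs):
    """Scan thresholds 0.05, 0.10, ..., 0.95 and return the cheapest one."""
    grid = [i / 100 for i in range(5, 100, 5)]
    return min(grid, key=lambda t: expected_cost(y_true, scores, t, **costs))

def retention_action(score, threshold):
    """Map a churn score to a hypothetical retention action."""
    if score < threshold:
        return "no action"
    return "call from retention team" if score >= 0.8 else "discount email"

# Toy validation labels and model scores, for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.30, 0.35, 0.45, 0.60, 0.70, 0.20, 0.90]

threshold = best_threshold(y_true, scores)
actions = [retention_action(s, threshold) for s in scores]
print(threshold, actions)
```

Because a missed churner is assumed to cost ten times as much as an unnecessary offer, the selected threshold sits well below 0.5, trading precision for recall.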

Closing Thoughts

This project demonstrates how to take a standard churn dataset and transform it into a deployable churn prediction platform:

Modular Python ML pipeline
Interactive Gradio interface
Explainability via SHAP
Deployment on Hugging Face for fast access

It’s a reusable template for anyone looking to apply machine learning to churn analysis in production.