Customer Segmentation Analysis
Data Description
The dataset includes the following features for each customer order:
- order_item_quantity: The number of items ordered.
- sales: Total sales value of the order.
- benefit_per_order: Profit or loss from the order.
- order_item_discount: Discount applied to the order items.
- order_item_total: Total value of the order after discount.
- OrderDuration: Time taken from order placement to shipping.
- order_duration: An alternative measure of the order processing time.
Clustering Process
The clustering process involved the following steps:
- Data Preprocessing: Data was cleaned, and features were scaled to prepare for clustering.
- Elbow Method: Applied to determine the optimal number of clusters, which was found to be 4.
- K-Means Clustering: The algorithm was used to segment the customer data into 4 distinct groups.
- Analysis: Post-clustering analysis was performed to understand the characteristics of each cluster.
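
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn and NumPy, uses `StandardScaler` as the scaling method (the text does not say which scaler was used), and runs on synthetic stand-in data with three hypothetical columns. The helper names `elbow_inertias` and `segment` are invented for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def elbow_inertias(X, k_values, random_state=0):
    """Scale the features, then record the K-Means inertia for each candidate k."""
    X_scaled = StandardScaler().fit_transform(X)
    inertias = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        model.fit(X_scaled)
        # Inertia is the within-cluster sum of squared distances.
        inertias[k] = model.inertia_
    return inertias


def segment(X, n_clusters=4, random_state=0):
    """Fit a scaler and K-Means, returning both plus the cluster label per row."""
    scaler = StandardScaler().fit(X)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(scaler.transform(X))
    return scaler, model, labels


# Synthetic stand-in data (an assumption): four loose groups over three
# made-up columns standing in for quantity, sales, and discount.
rng = np.random.default_rng(0)
centers = np.array(
    [[1, 50, 5], [2, 200, 40], [5, 600, 20], [10, 300, 10]], dtype=float
)
X_demo = np.vstack(
    [c + rng.normal(scale=[0.3, 10, 2], size=(50, 3)) for c in centers]
)

inertias = elbow_inertias(X_demo, range(1, 8))
for k, value in inertias.items():
    print(k, round(value, 1))

scaler, model, labels = segment(X_demo, n_clusters=4)
```

Plotting inertia against k and looking for the point where the curve flattens is the usual way to read the elbow. Scaling matters here because K-Means relies on Euclidean distance, so an unscaled column such as sales would otherwise dominate the smaller-valued features.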
Cluster Interpretation
The clusters were interpreted based on their centroid characteristics:
- Segment 0: ‘Casual Shoppers’ - lower in quantity and sales, less sensitive to discounts.
- Segment 1: ‘Promotion Sensitive Buyers’ - attracted by discounts, with higher sales variability.
- Segment 2: ‘High-Spending Buyers’ - characterized by high sales and order totals.
- Segment 3: ‘Bulk Shoppers’ - high quantity but moderate sales values, indicating bulk purchases.
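
One way to read centroids in business terms is to map them back from scaled space to the original units and list them next to the cluster sizes. The sketch below assumes scikit-learn and NumPy, `StandardScaler` scaling, and synthetic data; `centroid_profile` and the three-column feature subset are hypothetical choices for illustration. Cluster indices from K-Means are arbitrary, so the segment numbers it produces will not necessarily line up with Segments 0 to 3 above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Subset of the dataset's feature names, used here only as column labels.
FEATURES = ["order_item_quantity", "sales", "order_item_discount"]


def centroid_profile(X, feature_names, n_clusters=4, random_state=0):
    """Cluster scaled data and report each centroid in the original units."""
    scaler = StandardScaler().fit(X)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(scaler.transform(X))
    # Undo the scaling so each centroid reads as a per-cluster feature mean.
    centroids = scaler.inverse_transform(model.cluster_centers_)
    sizes = np.bincount(labels, minlength=n_clusters)
    profile = []
    for i in range(n_clusters):
        row = {"segment": i, "size": int(sizes[i])}
        row.update(
            {name: float(v) for name, v in zip(feature_names, centroids[i])}
        )
        profile.append(row)
    return profile


# Synthetic stand-in data (an assumption), not the real order dataset.
rng = np.random.default_rng(1)
centers = np.array(
    [[1, 50, 5], [2, 200, 40], [5, 600, 20], [10, 300, 10]], dtype=float
)
X_demo = np.vstack(
    [c + rng.normal(scale=[0.3, 10, 2], size=(50, 3)) for c in centers]
)

profile = centroid_profile(X_demo, FEATURES)
for row in profile:
    print(row)
```

Comparing each row against the overall feature means is what supports labels such as "high quantity but moderate sales"; the names themselves remain a judgment call on top of the numbers.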
The code and analysis are available here
