Customer Segmentation Analysis

Data Description

The dataset includes the following features for each customer order:

order_item_quantity: The number of items ordered.
sales: Total sales value of the order.
benefit_per_order: Profit or loss from the order.
order_item_discount: Discount applied to the order items.
order_item_total: Total value of the order after discount.
OrderDuration: Time taken from order placement to shipping.
order_duration: An alternative measure of the order processing time.

Clustering Process

The clustering process involved the following steps:

Data Preprocessing: Data was cleaned, and features were scaled to prepare for clustering.
Elbow Method: Applied to determine the optimal number of clusters, which was found to be 4.
K-Means Clustering: The algorithm was used to segment the customer data into 4 distinct groups.
Analysis: Post-clustering analysis was performed to understand the characteristics of each cluster.
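
The steps above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the project's actual code: the data is synthetic (generated around four made-up centers standing in for quantity, sales, and discount), and the range of k values, `n_init`, and `random_state` are assumptions chosen for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the order table (assumption: the real dataset is not shown here).
# Columns loosely mimic order_item_quantity, sales, order_item_discount.
rng = np.random.default_rng(0)
centers = np.array(
    [[1, 50, 5], [2, 200, 40], [3, 400, 20], [5, 150, 10]], dtype=float
)
X = np.vstack(
    [c + rng.normal(scale=[0.3, 10, 2], size=(100, 3)) for c in centers]
)

# Preprocessing: put every feature on a comparable scale before clustering.
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: record the within-cluster sum of squares (inertia) for each k.
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")

# K-Means with the chosen number of clusters (4, as reported above).
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)
```

Plotting the inertia values against k and looking for the point where the curve flattens is the usual way to read off the elbow; the choice is a visual judgment rather than a hard rule.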

Cluster Interpretation

The clusters were interpreted based on their centroid characteristics:

Segment 0: ‘Casual Shoppers’ - lower in quantity and sales, less sensitive to discounts.
Segment 1: ‘Promotion Sensitive Buyers’ - attracted by discounts, with higher sales variability.
Segment 2: ‘High-Spending Buyers’ - characterized by high sales and order totals.
Segment 3: ‘Bulk Shoppers’ - high quantity but moderate sales values, indicating bulk purchases.
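
One way to read centroids like these is to map them back from scaled units to the original feature units and look at them alongside the cluster sizes. The sketch below assumes scikit-learn and uses randomly generated data, so its numbers do not correspond to the segments described above; only the feature names are taken from the data description.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Feature names come from the data description; the values are synthetic (assumption).
features = ["order_item_quantity", "sales", "order_item_discount"]
rng = np.random.default_rng(1)
X = rng.gamma(shape=2.0, scale=[1.0, 100.0, 10.0], size=(500, 3))

# Fit the scaler and the 4-cluster model on the scaled data.
scaler = StandardScaler()
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(scaler.fit_transform(X))

# Centroids live in scaled space; convert them back to the original units.
centroids = scaler.inverse_transform(kmeans.cluster_centers_)

# Number of orders assigned to each segment.
sizes = np.bincount(kmeans.labels_, minlength=4)

for seg, (row, n) in enumerate(zip(centroids, sizes)):
    summary = ", ".join(f"{name}={value:.1f}" for name, value in zip(features, row))
    print(f"Segment {seg} (n={n}): {summary}")
```

Comparing each centroid row against the overall feature means is what supports labels such as "high quantity but moderate sales".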

The code and analysis are available here
