Does Gender Relate To a Better Spending Score?

Mohammed Alshamasi
4 min read · Mar 17, 2021

A data-based approach using mall customer data.

Introduction

There are many assumptions and biases about Spending Score and which Gender spends more than the other. Other factors, such as Age group and annual income, also come into play.

Now, I have a personal viewpoint on the reality of these statements, and you likely have your own. But what does the Data suggest? Unfortunately, the data sample is very small, but what we found is very interesting.

Do Females have a higher Spending Score than Males, or is it the other way around?

Which age group has a higher Spending Score?

What is the distribution of Age group with Gender based on the average of their Spending Score?

Part I: Do Females have a higher Spending Score than Males, or is it the other way around?

This was one of the questions I was most interested in. We have more Females than Males in our Dataset, which could skew the result a little bit.

The difference was about 12% towards Females. Regardless, we can still track their Spending behavior by looking at the total and the average Spending Score for each Gender.

Figure 1. Table for total Spending Score and the Average Spending Score

We can see a huge gap between the two Genders in the total, but not so much in the average. The gap in the total could be a result of the unbalanced Dataset, but we can still say that, on average, Females in this sample have a better Spending Score.
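The totals-versus-averages comparison above can be sketched in a few lines of plain Python. This is only an illustration: the rows are hypothetical toy values, not the article's data, and the function name `summarize_by_gender` is made up for the example. The toy numbers are chosen so that the totals differ purely because one group is larger, while the averages match.

```python
from collections import defaultdict

# Hypothetical toy rows of (gender, spending_score); NOT the article's data.
customers = [
    ("Female", 60), ("Female", 40), ("Female", 50),
    ("Male", 45), ("Male", 55),
]

def summarize_by_gender(rows):
    """Return {gender: (count, total_score, average_score)}."""
    scores = defaultdict(list)
    for gender, score in rows:
        scores[gender].append(score)
    return {
        gender: (len(vals), sum(vals), sum(vals) / len(vals))
        for gender, vals in scores.items()
    }

summary = summarize_by_gender(customers)
for gender, (count, total, average) in summary.items():
    print(f"{gender}: n={count}, total={total}, average={average:.1f}")
```

Reporting the count next to the total makes it easier to tell whether a gap in totals comes from group size or from per-customer behavior.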

Part II: Which age group has a higher Spending Score?

I was really curious which Age group has the highest average Spending Score and how it compares to the other Age groups, and I found some really interesting results.

Figure 2. Which age group has a higher Spending Score?

We can see that the Age groups of 24–29 and 30–35 have a higher Spending Score than any other group. What interested me even more is that the Age group of 18–23 has a high Spending Score compared to their average annual income.
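A minimal sketch of how ages could be binned into groups like these and averaged. It assumes 6-year bins starting at age 18, which matches the labels shown (18–23, 24–29, 30–35, 36–41); the helper `age_group` and the rows are hypothetical, not taken from the original analysis.

```python
from collections import defaultdict
from statistics import mean

def age_group(age, start=18, width=6):
    """Label an age with its bin, e.g. 25 -> '24–29'.

    Assumes bins are `width` years wide and the first bin starts at `start`.
    """
    low = start + ((age - start) // width) * width
    return f"{low}–{low + width - 1}"

# Hypothetical toy rows of (age, spending_score); NOT the article's data.
rows = [(19, 70), (22, 50), (25, 80), (28, 60), (33, 75), (40, 30)]

by_group = defaultdict(list)
for age, score in rows:
    by_group[age_group(age)].append(score)

# Average Spending Score per Age group, in ascending bin order.
avg_by_group = {group: mean(scores) for group, scores in sorted(by_group.items())}
for group, value in avg_by_group.items():
    print(group, value)
```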

Part III: What is the distribution of Age group with Gender based on the average of their Spending Score?

I believe that combining both questions could give us some confirmation of our first answer. Since we found that Females tend to have a better average Spending Score, we want to see how that holds across the Age groups we created.

Figure 3. The distribution of Age group with Gender based on the average of their Spending Score.

We can see that Females have the higher average Spending Score in almost every Age group. The two exceptions are the 24–29 and 36–41 Age groups. This is consistent with the earlier finding that Females tend to have a better Spending Score than Males.
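The Age group by Gender breakdown can be sketched by grouping on both keys at once and then checking which Gender has the higher average in each group. As before, this is an illustration under assumptions: 6-year bins from age 18, a hypothetical `age_group` helper, and made-up rows.

```python
from collections import defaultdict
from statistics import mean

def age_group(age, start=18, width=6):
    """Label an age with its bin (assumed 6-year bins starting at 18)."""
    low = start + ((age - start) // width) * width
    return f"{low}–{low + width - 1}"

# Hypothetical toy rows of (gender, age, spending_score); NOT the article's data.
rows = [
    ("Female", 20, 70), ("Male", 21, 50),
    ("Female", 26, 55), ("Male", 27, 65),
    ("Female", 31, 80), ("Male", 34, 60),
]

# Collect scores per (age group, gender) cell, then average each cell.
cells = defaultdict(list)
for gender, age, score in rows:
    cells[(age_group(age), gender)].append(score)
avg = {key: mean(scores) for key, scores in cells.items()}

# For each Age group, find the Gender with the higher average score.
leaders = {}
for group in sorted({g for g, _ in avg}):
    per_gender = {gender: value for (g, gender), value in avg.items() if g == group}
    leaders[group] = max(per_gender, key=per_gender.get)

print(leaders)
```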

Clustering:

I used the K-means clustering algorithm with 5 clusters. I chose 5 based on the elbow method, which helps find a suitable number of clusters for a given use case.

I fed only 2 features into the algorithm, Annual Income and Spending Score, and it grouped the customers into 5 clusters.
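To make the elbow method concrete, here is a rough, standard-library-only sketch. It runs a plain version of K-means (Lloyd's algorithm) for several values of k and records the inertia, meaning the sum of squared distances from each point to its assigned centroid. The points are hypothetical (income, score) pairs built around five made-up centers, and the helper names are invented for this example. In practice a library implementation such as scikit-learn's KMeans, which exposes an `inertia_` attribute, would normally be used instead.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # Move the centroid to the mean of its members.
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia

# Hypothetical toy data: five loose groups of (annual_income, spending_score).
centers = [(20, 20), (20, 80), (55, 50), (90, 15), (90, 85)]
offsets = [(-2, -1), (0, 2), (2, 0), (1, -2)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

inertias = {}
for k in range(1, 9):
    # Keep the best of a few restarts, since K-means can stop in a local minimum.
    inertias[k] = min(kmeans(points, k, seed=s)[2] for s in range(10))

for k, value in inertias.items():
    print(k, round(value, 1))
```

Plotting inertia against k and looking for the point where the curve stops dropping sharply (the "elbow") gives a candidate number of clusters. It is a heuristic, so the choice still involves judgment.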

Figure 4. Clustering based on Annual Income and Spending Score.

These clusters can help us decide whom to target with a push to visit the mall and increase their Spending Score. For example, targeting cluster 1 is a lost cause: they have a very good Annual Income, yet their Spending Score is very low. Cluster 4, by contrast, shows more signs of increasing its Spending Score.
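One way to support this kind of targeting decision is to profile each cluster by its average Annual Income and Spending Score. The sketch below assumes cluster labels have already been assigned. The rows and the label numbers other than cluster 1 are hypothetical, and only cluster 1 is shaped to mirror the high-income, low-score pattern described above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (cluster_label, annual_income, spending_score) rows;
# NOT the article's data.
labeled = [
    (0, 25, 75), (0, 30, 85),
    (1, 85, 15), (1, 95, 20),
    (2, 55, 50), (2, 60, 45),
]

groups = defaultdict(list)
for label, income, score in labeled:
    groups[label].append((income, score))

# Profile = (average income, average spending score) per cluster.
profiles = {
    label: (mean(i for i, _ in rows), mean(s for _, s in rows))
    for label, rows in sorted(groups.items())
}

for label, (avg_income, avg_score) in profiles.items():
    print(f"cluster {label}: income={avg_income}, score={avg_score}")
```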

Conclusion

In this article, we took a look at mall customer Data, which included every customer's Spending Score:

  1. We saw that, on average, Females' Spending Score is better than Males'.
  2. We found that customers in the Age groups of 24–29 and 30–35 tend to have a better average Spending Score than any other Age group.
  3. We also saw that Females tend to have the higher average Spending Score across almost every Age group.
  4. Finally, we clustered the customers into 5 clusters based on their Annual Income and Spending Score, so we can target customers depending on the cluster they fall into.

The findings here are observational, not the result of a formal study. So the real question remains:

What marketing strategy will YOU use to target each cluster?

To see more about this analysis, visit my GitHub, linked here.
