Saima Abbas

Data Analyst | Machine Learning Enthusiast | Data Storyteller

Turtle Games: Customer Analytics Project

Overview:

As part of the LSE Data Analytics Online Career Accelerator (Course 3: Advanced Analytics for Organisational Impact), this project explored how Turtle Games can use data to improve marketing, customer retention, and decision-making. The goal was to understand loyalty patterns, segment customer groups, and analyse sentiment from customer reviews.

I used R for statistical analysis of the data distributions, and Python for predictive modelling, clustering, and Natural Language Processing (NLP). The datasets covered customer demographics, loyalty behaviour, and customer reviews. Before modelling, correlation analysis was used to explore the relationships between variables.
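To illustrate what the correlation step measures, here is a minimal sketch of Pearson's correlation coefficient in plain Python. It is not the project's code: `pearson_r` is a hypothetical helper, and the `spending` and `loyalty` values are made-up toy numbers rather than Turtle Games data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only (assumed, not from the project data)
spending = [10, 20, 30, 40, 50]
loyalty = [120, 210, 330, 390, 520]

r = pearson_r(spending, loyalty)
print(f"Pearson r between spending and loyalty: {r:.3f}")
```

A value close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.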

For the statistical analysis, I focused on:

Based on the observations and insights, I was able to make the following recommendations:

Approach:

To answer Turtle Games’ key business questions, I applied a multi-method data analysis strategy combining both Python (Pandas, Scikit-learn, Matplotlib, Seaborn, TextBlob) and R. Key steps included:


Insights Summary:

  • Loyalty peaked among customers aged 32–34, and those with high salary and spending.
Histogram of Age showing age distribution
  • Female and Basic-Education customers showed slightly higher loyalty.
Loyalty Points by Gender
  • Spending and Salary were the strongest loyalty predictors; Product offered limited insight.
Correlation Matrix showing relationship between variables
  • Five customer clusters were identified via K-Means segmentation, each with unique income–spending traits.
  • Customers in Cluster 2 (high income & high spenders) were ideal for premium targeting.
The five clusters of customers segmentation

  • Modelling Insights:
    The best regression model used Salary + Age + Spending (RMSE: 513, MAE: 395).
Regression table showing a comparison of different regression models
  • The top decision tree model used Spending, Salary, and Age (R²: 0.9961, MAE: 26).
The best decision tree regressor model
  • Sentiment Insights
    Reviews were largely neutral to positive. Most common words were 'fun', 'game', 'great', and 'love'.
word cloud
  • Negative reviews highlighted usability issues, product quality concerns, and unclear instructions.
Negative reviews
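
For readers less familiar with the metrics quoted in the modelling insights (RMSE, MAE, R²), the sketch below shows how each is defined, in plain Python. It is an illustration under stated assumptions, not the project's code: `regression_metrics` is a hypothetical helper, and the `actual` and `predicted` values are made-up toy numbers, not loyalty points from the dataset.

```python
import math


def regression_metrics(actual, predicted):
    """Return RMSE, MAE and R² for paired actual and predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    # MAE: average absolute error, in the same units as the target
    mae = sum(abs(e) for e in errors) / n
    # RMSE: square root of the mean squared error; penalises large errors more
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # R²: share of the target's variance explained by the predictions
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are equal")
    r2 = 1 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Toy values for illustration only (assumed, not from the project data)
actual = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower RMSE and MAE mean predictions sit closer to the observed values, while an R² nearer to 1 means more of the variation in the target is explained.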

For a complete picture, feel free to look at my report. Click on the GitHub icon to see my complete Python and R code.