Saima Abbas

Data Analyst | Machine Learning Enthusiast | Data Storyteller

Applying Python to NHS Dataset

Overview:

As part of the Data Analytics Online Career Accelerator (Course 2), I conducted an in-depth analysis of real NHS datasets to explore how the UK's healthcare system is responding to increased patient demand, digital engagement, and missed appointments. This project combined Python-based exploratory analysis, data visualisation, and public Twitter sentiment to derive actionable healthcare insights.



The analysis focused on:

Based on the gathered insights, I was able to make the following recommendations:

Approach:

For this project, Python was used for data wrangling, web scraping, and visualisation (Pandas, Matplotlib, Seaborn, Beautiful Soup). APIs were also used.
To answer the NHS's questions, I cleaned, analysed, and visualised four datasets: actual_duration.csv, appointments_regional.csv, national_categories.xlsx, and tweets.csv. Key steps included:
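The notebook code is not reproduced on this page. As a rough illustration of the kind of loading and cleaning step described here, the sketch below reads a small in-memory stand-in for appointments_regional.csv. The column names, the sample rows, and the helper `load_and_clean` are assumptions made for the example, not the project's actual code.

```python
import io

import pandas as pd

# Tiny stand-in for appointments_regional.csv. The column names and
# values are assumptions for illustration, not taken from the real file.
SAMPLE_CSV = """appointment_month,appointment_mode,count_of_appointments
2021-10,Face-to-Face,120
2021-10,Telephone,80
2021-10,Telephone,80
2021-11,Face-to-Face,
"""


def load_and_clean(source):
    """Read a CSV, parse the month column, drop duplicate and incomplete rows."""
    df = pd.read_csv(source)
    # Convert "YYYY-MM" strings into proper timestamps for time-based grouping.
    df["appointment_month"] = pd.to_datetime(df["appointment_month"], format="%Y-%m")
    # Whether exact duplicates are safe to drop depends on the dataset;
    # here it is assumed that they are accidental repeats.
    df = df.drop_duplicates().dropna(subset=["count_of_appointments"])
    df["count_of_appointments"] = df["count_of_appointments"].astype(int)
    return df.reset_index(drop=True)


appointments = load_and_clean(io.StringIO(SAMPLE_CSV))
print(appointments)
```

For the real files, a file path would be passed instead of the `io.StringIO` object, and `pd.read_excel` would be the natural counterpart for national_categories.xlsx.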

Key Insights:

  • Appointment volume peaked in October 2021 and March 2022, aligning with seasonal health pressures (e.g., flu season). Despite these peaks, average daily appointments remained within the NHS guideline of 1.2 million per day, indicating that the NHS has adequate capacity at the national level.
Monthly Capacity Utilisation of Daily Appointments
  • General Practice dominates appointment volumes, especially in winter months. GPs and Other Practice Staff share the bulk of the workload.
Number of Patient Appointments per month by Service Settings
  • Face-to-face appointments dominated, but telephone consultations were also heavily used, suggesting a successful hybrid model. Home visits and video consultations remained low but stable, pointing to potential underuse or limited applicability.
Monthly Appointments Counts by Appointment Mode
  • Most appointments are booked same-day or within a week, highlighting the need for short-notice availability.
Time Between Booking and Appointment Over 11 Month Period
  • Missed appointments ("Did Not Attend", or DNAs) are highest for bookings made 8–14 days in advance; these may benefit most from targeted reminders.
Missed Appointments by Time Between Booking and Appointment
  • Appointment mode affects attendance: face-to-face appointments have the highest DNA rate, while home visits and telephone consultations see better follow-through.
Missed Appointments over months by Appointment Mode
  • Top Twitter hashtags included #Healthcare, #MedTwitter, and #DigitalHealth, showing strong engagement around innovation, education, and systemic critiques.
Most Frequently Used Hashtags After Removing Overrepresented Hashtags
  • General Practice consistently recorded the highest number of appointments across all months and quarters. Care-related encounters were the dominant context type, especially under national categories such as General Consultation Acute and Routine. The five locations with the most records were in and around London.
Number of Patient Appointments per Service Setting in 1st Quarter (June - August 2021)
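Two of the insights above, capacity utilisation against the 1.2 million guideline and the DNA breakdown by booking lead time, come down to short Pandas calculations. The sketch below shows one way they could be computed. All figures are made up, the column names and the helpers `add_utilisation` and `dna_rate` are hypothetical, and the daily average is taken over calendar days, which may differ from the method used in the notebook.

```python
import pandas as pd

DAILY_GUIDELINE = 1_200_000  # guideline figure quoted in the insights above

# Made-up monthly totals; column names are assumptions for illustration.
monthly = pd.DataFrame({
    "appointment_month": pd.to_datetime(["2021-10-01", "2021-11-01"]),
    "count_of_appointments": [31_000_000, 27_000_000],
})


def add_utilisation(df):
    """Add average daily appointments and their ratio to the guideline."""
    out = df.copy()
    days = out["appointment_month"].dt.days_in_month  # calendar days per month
    out["daily_average"] = out["count_of_appointments"] / days
    out["utilisation"] = out["daily_average"] / DAILY_GUIDELINE
    return out


# Made-up attendance counts by time between booking and appointment.
attendance = pd.DataFrame({
    "time_between_book_and_appointment": [
        "Same Day", "Same Day", "8 to 14 Days", "8 to 14 Days",
    ],
    "appointment_status": ["Attended", "DNA", "Attended", "DNA"],
    "count_of_appointments": [950, 50, 800, 200],
})


def dna_rate(df, by):
    """Share of appointments marked DNA within each group of `by`."""
    totals = df.groupby(by)["count_of_appointments"].sum()
    missed = (
        df[df["appointment_status"] == "DNA"]
        .groupby(by)["count_of_appointments"]
        .sum()
    )
    # Groups with no DNA rows are absent from `missed`; treat them as 0.
    return (missed / totals).fillna(0.0)


utilisation = add_utilisation(monthly)
rates = dna_rate(attendance, "time_between_book_and_appointment")
print(utilisation[["appointment_month", "daily_average", "utilisation"]])
print(rates)
```

Passing a different grouping column to `dna_rate`, for example an appointment-mode column, would give the per-mode comparison. Using a rate rather than a raw count keeps high-volume groups from dominating the picture simply because they have more appointments.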

For a complete picture, feel free to look at my report and complete Python code in the Jupyter Notebook on GitHub.