This is a project from British Airways' job simulation on Forage. All code can be found here.
The first task was to scrape and collect customer feedback from a third-party source and analyse the data to uncover insights. The Beautiful Soup library was used to scrape reviews for British Airways from airlinequality.com.
A CSV file containing 1000 reviews was obtained.
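The scrape-and-save step can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the `text_content` class name, the sample HTML, and the helper names `extract_reviews` and `reviews_to_csv` are all assumptions made for the example, and the page download itself is left out.

```python
import csv
import io

from bs4 import BeautifulSoup

# Assumption: each review body sits in a <div class="text_content"> element.
REVIEW_CLASS = "text_content"


def extract_reviews(html: str) -> list[str]:
    """Return the text of every review block found in one page of HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text(strip=True) for div in soup.find_all("div", class_=REVIEW_CLASS)]


def reviews_to_csv(reviews: list[str]) -> str:
    """Serialise the reviews as CSV text with a single 'reviews' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["reviews"])
    writer.writerows([review] for review in reviews)
    return buffer.getvalue()


# Hypothetical page fragment standing in for a downloaded review page.
sample_html = """
<div class="text_content">Trip Verified | Good flight, friendly crew.</div>
<div class="text_content">Not Verified | Flight cancelled, no refund.</div>
"""

reviews = extract_reviews(sample_html)
csv_text = reviews_to_csv(reviews)
print(csv_text)
```

In a real run, the HTML for each results page would be fetched with an HTTP client and passed to `extract_reviews`, with the results accumulated until the target number of reviews is reached.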
The initial CSV was not clean: the first column had no title, and the 'reviews' column was not formatted appropriately. The file was cleaned by splitting text into columns, trimming whitespace, and creating data groups.
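The cleaning steps could look something like the sketch below. It assumes, purely for illustration, that each raw review carries a status prefix separated from the text by a `|` character, and that "data groups" means grouping reviews by that status; the sample rows and the `clean_review` helper are hypothetical.

```python
from collections import defaultdict


def clean_review(raw: str) -> dict[str, str]:
    """Split one raw review into a status column and a review column."""
    status, separator, text = raw.partition("|")
    if not separator:
        # No delimiter found: treat the whole string as review text.
        status, text = "", raw
    # Trim stray whitespace from both columns.
    return {"status": status.strip(), "review": text.strip()}


# Hypothetical raw rows, as they might appear in the uncleaned CSV.
raw_rows = [
    "  Trip Verified |  Good flight, friendly crew.  ",
    "Not Verified | Flight cancelled, no refund. ",
    "Seat was cramped but food was fine.",
]

cleaned = [clean_review(row) for row in raw_rows]

# Group the cleaned reviews by status; rows without a status go to "Unknown".
groups = defaultdict(list)
for row in cleaned:
    groups[row["status"] or "Unknown"].append(row["review"])

for status, texts in groups.items():
    print(status, len(texts))
```

Using `str.partition` splits on the first delimiter only, so any later `|` characters inside the review text are left untouched.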
After cleaning, the data was analysed. Word clouds were generated, and sentiment analysis and topic modelling were performed.
In the overall word cloud, apart from expected words such as 'flight', 'BA' and 'British Airways', the words that stand out are 'service', 'food', 'seat', 'time', 'good', and 'cancelled'.
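A word cloud is essentially a picture of word frequencies, so the same observation can be approximated with a simple count. The sketch below is not the project's code: the stop-word list, the set of expected airline words, the sample reviews, and the `top_words` helper are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {
    "the", "a", "and", "was", "were", "to", "of", "in", "on", "is",
    "it", "i", "my", "we", "for", "with", "but", "no", "not",
}
# Words expected in any airline review, excluded so other terms stand out.
EXPECTED_WORDS = {"flight", "ba", "british", "airways"}


def top_words(reviews: list[str], n: int = 5) -> list[tuple[str, int]]:
    """Return the n most frequent words after removing stop and expected words."""
    counts = Counter()
    for review in reviews:
        tokens = re.findall(r"[a-z']+", review.lower())
        counts.update(
            token for token in tokens
            if token not in STOP_WORDS and token not in EXPECTED_WORDS
        )
    return counts.most_common(n)


# Hypothetical reviews used only to exercise the function.
sample_reviews = [
    "The flight was on time and the food was good.",
    "BA cancelled my flight. Customer service was poor.",
    "Good seat, good service, but the food was cold.",
]

print(top_words(sample_reviews, 3))
```

The same counts could be fed to a word cloud library, where more frequent words are drawn larger.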
The sentiment distribution shows that over 60% of the reviews were generally positive and over 30% were negative; the rest were neutral.
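Once each review has a sentiment label, the distribution is just the share of each label. A minimal sketch follows, assuming the labels have already been assigned by whatever sentiment tool is used; the label values and the `sentiment_shares` helper are hypothetical and the sample data is not from the project.

```python
from collections import Counter


def sentiment_shares(labels: list[str]) -> dict[str, float]:
    """Return the percentage of reviews carrying each sentiment label."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: round(100 * count / total, 1) for label, count in counts.items()}


# Hypothetical labels, one per review.
sample_labels = ["positive", "positive", "positive", "negative", "neutral"]

shares = sentiment_shares(sample_labels)
print(shares)
```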
Two more word clouds were generated to find out the common words used in positive and negative reviews. For positive reviews, words that stand out include 'food', 'seat', 'time', and 'service'. For negative reviews, words that stand out are 'time', 'seat', 'luggage', 'customer service', and 'cancelled'.
Based on the analysis, the following can be deduced: