Let’s take the example of any product company. The success of the company or its product directly depends on its customer. If the customer likes their product, then the product is a success. If not, then the company certainly needs to improve its product by making changes to it.
How does the company know whether their product is successful or not?
For that, the company needs to analyse its customers, and one attribute worth analysing is the customer’s sentiment. This is where Sentiment Analysis comes in.
What is Sentiment Analysis?
It is the process of computationally identifying and categorising opinions in a piece of text to determine whether the writer's attitude towards a particular topic or product is positive, negative or neutral.
Sentiment analysis is a topic of great interest and active development, and it has many practical applications. The amount of publicly and privately available information on the Internet is constantly growing, and a large number of texts expressing opinions are available on social media, blogs, forums, etc.
Using sentiment analysis, unstructured data can be automatically transformed into structured data about public opinion on products, services, brands, politics or any other topic that people express opinions about. This data can be very useful for marketing analysis, public relations, product reviews, net promoter scoring, product feedback and customer service.
The input is a corpus: a collection of text documents from which we want to extract sentiment. For example, if a text contains the word "great", we want it flagged as positive, but if it contains the phrase "not great", we want it flagged as negative. To capture this, we keep all of the original text in its original order.
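To see why word order matters, here is a deliberately tiny rule-based scorer. It is a toy written for illustration, not TextBlob's algorithm: the word scores in LEXICON, the NEGATIONS set and the single "flip the next scored word" rule are all assumptions.

```python
# Toy lexicon: hypothetical word scores chosen only for this illustration.
LEXICON = {"great": 0.8, "good": 0.7, "bad": -0.7, "terrible": -1.0}
NEGATIONS = {"not", "never", "no"}


def toy_polarity(text):
    """Average the scores of known words, flipping the sign after a negation."""
    scores = []
    negate = False
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in NEGATIONS:
            negate = True  # remember the negation for the next word
            continue
        if word in LEXICON:
            score = LEXICON[word]
            scores.append(-score if negate else score)
        negate = False  # a negation only affects the word right after it
    return sum(scores) / len(scores) if scores else 0.0


print(toy_polarity("The show was great"))      # positive score
print(toy_polarity("The show was not great"))  # negative score
```

A plain word count would see the same word "great" in both sentences, which is why the text has to stay in its original order.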
TextBlob is a Python library that is built on top of NLTK. It’s easier to use and provides some additional functionality, such as rule-based sentiment scores. TextBlob finds all of the words and phrases that it can assign a polarity and subjectivity to, and averages them together. More information can be found in the documentation.
For each word, we get a polarity score (how positive or negative it is) and a subjectivity score (how opinionated it is).
Polarity: How positive or negative a word is. -1 is very negative. +1 is very positive.
Subjectivity: How subjective, or opinionated a word is. 0 is fact. +1 is very much opinion.
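As a small sketch of these two scores, assuming TextBlob is installed (pip install textblob): the sentiment property of a TextBlob object returns both polarity and subjectivity. The helper name score_text is made up for this example.

```python
def score_text(text):
    """Return (polarity, subjectivity) for one piece of text using TextBlob."""
    # Third-party import kept local so this file still loads
    # on a machine where TextBlob is not installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity


# Example usage:
#   polarity, subjectivity = score_text("The show was great")
#   print(polarity, subjectivity)
```

Polarity falls between -1 and +1 and subjectivity between 0 and 1, matching the ranges described above.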
In this blog, we will extract Twitter data using Tweepy and run sentiment analysis on the extracted tweets using the TextBlob library in Python.
Importing necessary libraries
Creating a Twitter App
In order to extract tweets, we need to log in to our Twitter account and create an app.
We can get the credentials from here:
Creating a function to access Twitter API
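A minimal sketch of such a function, assuming the four credentials are stored in environment variables. The variable names below are hypothetical, and load_credentials is a helper invented for this example. It uses tweepy.OAuthHandler with set_access_token; newer Tweepy releases also offer OAuth1UserHandler for the same purpose.

```python
import os

# Hypothetical environment variable names; use whatever suits your setup.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=None):
    """Read the four Twitter credentials, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"Missing Twitter credentials: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}


def twitter(env=None):
    """Build an authenticated tweepy.API object from the stored credentials."""
    import tweepy  # third-party: pip install tweepy

    creds = load_credentials(env)
    auth = tweepy.OAuthHandler(
        creds["TWITTER_CONSUMER_KEY"], creds["TWITTER_CONSUMER_SECRET"]
    )
    auth.set_access_token(
        creds["TWITTER_ACCESS_TOKEN"], creds["TWITTER_ACCESS_SECRET"]
    )
    return tweepy.API(auth)
```

Keeping the keys out of the script itself means the code can be shared without exposing the credentials.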
We can create a ‘tw’ object using the twitter function above. Let’s analyse Ellen DeGeneres’s tweets. The count parameter is the number of tweets to fetch. We then print the last ten tweets. More information on Tweepy can be found in the documentation.
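The fetch step might look like the sketch below, assuming an authenticated tweepy.API object. It relies on Tweepy's user_timeline method; the helper name fetch_tweets, the handle "TheEllenShow" and the count of 200 in the usage comment are assumptions for illustration.

```python
def fetch_tweets(api, screen_name, count=10):
    """Return the text of up to `count` recent tweets from one account."""
    tweets = api.user_timeline(screen_name=screen_name, count=count)
    return [tweet.text for tweet in tweets]


# Example usage (needs valid credentials and network access):
#   tw = twitter()
#   texts = fetch_tweets(tw, "TheEllenShow", count=200)
#   for text in texts[-10:]:  # print the last ten tweets
#       print(text)
```

Because the function only needs an object with a user_timeline method, it can be tried out with a stand-in object before touching the real API.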
With the tweets extracted, we now convert them into a data frame.
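One way to do the conversion, assuming pandas is installed and the tweets are held as a list of strings. The helper name tweets_to_frame and the column name "tweets" are choices made for this sketch.

```python
def tweets_to_frame(texts):
    """Put a list of tweet texts into a one-column pandas DataFrame."""
    import pandas as pd  # third-party: pip install pandas

    return pd.DataFrame({"tweets": list(texts)})


# Example usage:
#   df = tweets_to_frame(["First tweet", "Second tweet"])
#   print(df.head())
```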
Before doing sentiment analysis, let’s clean the tweets using regular expressions. We create a small function called clean.
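The original clean function is not reproduced here, so the version below is one plausible sketch. Which parts to strip (links, @mentions, the leading retweet marker and the "#" sign) is an assumption about what counts as noise in a tweet.

```python
import re


def clean(text):
    """Strip links, @mentions, the retweet marker and '#' signs from a tweet."""
    text = re.sub(r"https?://\S+", "", text)  # remove links
    text = re.sub(r"@\w+", "", text)          # remove @mentions
    text = re.sub(r"^RT\b[\s:]*", "", text)   # remove a leading "RT" marker
    text = text.replace("#", "")              # keep the hashtag word, drop '#'
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace


print(clean("RT @TheEllenShow: Loved it! #fun https://t.co/abc123"))
```

Hashtag words are kept on purpose, since a tag such as "#love" can carry sentiment of its own.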
Let’s do sentiment analysis using TextBlob, finding both polarity and subjectivity for each tweet.
Calculating polarity and subjectivity.
We can see more positives than negatives. Let’s calculate how many tweets are positive, negative, and neutral.
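The counting step can be sketched as follows, assuming the polarity scores are already available as a list of numbers. Treating anything above zero as positive, below zero as negative and exactly zero as neutral is an assumption; other cut-offs are equally valid. The polarity values in the example are made up.

```python
from collections import Counter


def label(polarity):
    """Map a polarity score to 'positive', 'negative' or 'neutral'."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def count_labels(polarities):
    """Count how many scores fall into each sentiment class."""
    return Counter(label(p) for p in polarities)


# Made-up polarity values, only to show the shape of the result.
counts = count_labels([0.5, -0.2, 0.0, 0.8])
print(counts["positive"], counts["negative"], counts["neutral"])
```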
Ellen DeGeneres is undoubtedly one of the most famous celebrities, and the sentiment is more positive than negative, which suggests people are enjoying her show.
The above is a simple sentiment analysis using TextBlob. We can also run the analysis on tweets found by searching for any trending topic or hashtag on Twitter.
Thanks for reading. Keep learning and stay tuned for more!
Author: Dhilip Subramanian