Sentiment Analysis Using a Rule-Based Approach (TextBlob) on Twitter Data: Part I
This article is the first part of a tutorial blog series on sentiment analysis; it covers a rule-based approach to finding sentiment in Twitter data. At the end, there are also some exercises that will help you learn further.
Natural Language Processing is a domain of machine learning that deals with the interaction between computers and human languages, and sentiment analysis is the contextual mining of text to extract subjective information and understand public opinion. A large share of the population is active on various social media platforms, showing the world their interests (likes/dislikes, complimentary comments, criticism, etc.). Many MNCs use this to understand public opinion of their products and plan their future strategies. It can also be used to manipulate opinion, as Cambridge Analytica did. Two of the best places to gather these opinions are Twitter and Reddit. In this blog I will be working with Twitter.
There are three parts to this series:
1) Rule-Based Sentiment Analysis — Twitter
2) Classical Machine Learning Approaches — Reddit
3) Deep Neural Network — Twitter & Reddit
You can find all the scripts for this article on my GitHub, with proper documentation. There are three steps to perform this task on Twitter data.
Step 1: Get Twitter Data
Step 2: Analyse the Text for Sentiments
Step 3: Visualize and Document Insights
Step 1: Get Twitter Data
You need access to the Twitter API:
1. Create an account on the Twitter developer portal.
2. Create an application (if you don’t have a website, type in your Twitter account ID).
3. Generate the Consumer_key, Consumer_secret, Token_Access, and Token_Secret.
Development Time
Libraries: tweepy, numpy, json, time, re, os, and pandas
We will make the code modular.
1. Create a file “twitter_credentials.py”, create four variables, and assign the Twitter keys to them.
2. Create another file in the same location and import twitter_credentials.
3. Create a class that uses these variables to authenticate with Twitter (a sketch of these steps follows this list).
4. You can stream tweets from a particular user or tweets containing a particular word.
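Here is a minimal sketch of steps 1 to 3, assuming tweepy 3.x; the file name twitter_credentials.py comes from step 1, while the TwitterAuthenticator class name and placeholder values are my own illustrative choices:

```python
# twitter_credentials.py -- keep this file private (don't commit it)
CONSUMER_KEY = "your-consumer-key"
CONSUMER_SECRET = "your-consumer-secret"
ACCESS_TOKEN = "your-access-token"
ACCESS_TOKEN_SECRET = "your-access-token-secret"
```

```python
# second file, same location -- imports the credentials and authenticates
from tweepy import OAuthHandler

import twitter_credentials


class TwitterAuthenticator:
    """Performs the OAuth handshake so other classes can reuse it."""

    def authenticate_twitter_app(self):
        auth = OAuthHandler(twitter_credentials.CONSUMER_KEY,
                            twitter_credentials.CONSUMER_SECRET)
        auth.set_access_token(twitter_credentials.ACCESS_TOKEN,
                              twitter_credentials.ACCESS_TOKEN_SECRET)
        return auth
```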
To stream tweets containing a particular word:
5a. Create a Twitter listener class inheriting from tweepy’s StreamListener class, and override the on_data and on_error functions.
6a. Create a class Stream_tweets to authenticate the user and read tweets; here hashed_list is a list containing the words of interest (a sketch follows below).
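A sketch of steps 5a and 6a, again assuming tweepy 3.x (where StreamListener lives in tweepy.streaming); Stream_tweets and hashed_list match the names used above, the rest is illustrative:

```python
from tweepy import Stream
from tweepy.streaming import StreamListener


class TwitterListener(StreamListener):
    """Appends every incoming raw tweet (a JSON string) to a file."""

    def __init__(self, fetched_tweets_filename):
        super().__init__()
        self.fetched_tweets_filename = fetched_tweets_filename

    def on_data(self, data):
        with open(self.fetched_tweets_filename, "a") as tf:
            tf.write(data)
        return True

    def on_error(self, status):
        print(status)
        if status == 420:   # rate-limit warning: stop streaming
            return False


class Stream_tweets:
    """Authenticates and streams tweets containing the words in hashed_list."""

    def __init__(self, authenticator):
        self.authenticator = authenticator

    def stream_tweets(self, fetched_tweets_filename, hashed_list):
        listener = TwitterListener(fetched_tweets_filename)
        auth = self.authenticator.authenticate_twitter_app()
        stream = Stream(auth, listener)
        stream.filter(track=hashed_list)
```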
To stream tweets from a particular user:
5b. Create a class called User_Tweets to fetch the tweets of a particular user. Here you need to access the API by creating an API object (see the sketch below).
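A sketch of step 5b; tweepy.API and a Cursor over user_timeline are standard tweepy 3.x calls, and User_Tweets matches the class name above, while the rest is illustrative:

```python
import tweepy


class User_Tweets:
    """Fetches recent tweets from one user's timeline through the REST API."""

    def __init__(self, authenticator, twitter_user=None):
        auth = authenticator.authenticate_twitter_app()
        self.api = tweepy.API(auth)          # the API object mentioned above
        self.twitter_user = twitter_user

    def get_user_timeline_tweets(self, num_tweets):
        return list(tweepy.Cursor(self.api.user_timeline,
                                  id=self.twitter_user).items(num_tweets))
```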
7. Store the data in a CSV file. You can again do this by creating a class and calling its functionality. But before this, the text has to be cleaned to remove links and special characters.
8. Convert the tweets to a dataframe and store them in a CSV file (a sketch of steps 7 and 8 follows below).
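A sketch of steps 7 and 8, cleaning the text with re and writing a CSV with pandas; the regular expression is a common pattern for stripping mentions, links, and special characters, and the column names are my own:

```python
import re

import pandas as pd


def clean_tweet(tweet):
    """Removes @mentions, links and special characters from a tweet."""
    return " ".join(re.sub(r"(@[A-Za-z0-9_]+)|([^0-9A-Za-z \t])|(\w+://\S+)",
                           " ", tweet).split())


def tweets_to_csv(tweets, csv_path="tweets.csv"):
    """Converts a list of tweepy Status objects to a DataFrame and saves it."""
    df = pd.DataFrame({
        "tweet": [clean_tweet(t.text) for t in tweets],
        "id": [t.id for t in tweets],
        "date": [t.created_at for t in tweets],
        "likes": [t.favorite_count for t in tweets],
        "retweets": [t.retweet_count for t in tweets],
    })
    df.to_csv(csv_path, index=False)
    return df
```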
Now you are ready with your data to run rule-based sentiment analysis.
All these functionalities for interacting with the Twitter API were made possible by the tweepy library. There are a lot of other cool features this library offers, and they can be used as per your requirements.
Step 2: Analyse the Text for Sentiments
Once the tweets are ready, they can be used for sentiment analysis.
Development Time
Library Used — textblob
1. Create an Analyze_tweet class to feed each tweet to a TextBlob object (a sketch follows this list).
2. Run some sample text to check the functionality.
3. Create a list of results and store it in a CSV file that can later be used for plotting.
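A sketch of the analysis step; TextBlob(text).sentiment.polarity is the standard TextBlob call, the Analyze_tweet class name matches the step above, and the thresholds are my own convention:

```python
from textblob import TextBlob


class Analyze_tweet:
    """Assigns a rule-based sentiment label to a piece of text."""

    def analyze_sentiment(self, tweet):
        polarity = TextBlob(tweet).sentiment.polarity   # value in [-1.0, 1.0]
        if polarity > 0:
            return 1     # positive
        elif polarity == 0:
            return 0     # neutral
        return -1        # negative


# step 2: run some sample text to check the functionality
analyzer = Analyze_tweet()
print(analyzer.analyze_sentiment("I love this product!"))    # expected: 1
print(analyzer.analyze_sentiment("This is the worst."))      # expected: -1
```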
Step 3: Visualize Sentiment
Once the text is classified and the sentiments are stored, we can plot them to check a person’s sentiment toward a product, and also how positive or negative a person is in general. Visualization gives us the power to understand the sentiment around a topic and, based on that, how it might affect the community.
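For instance, a minimal time-series plot with pandas and matplotlib, assuming the CSV from Step 2 has “date” and “sentiment” columns (hypothetical names carried over from the sketches above):

```python
import matplotlib.pyplot as plt
import pandas as pd

# assumes the CSV written in Step 2 contains 'date' and 'sentiment' columns
df = pd.read_csv("tweets_with_sentiment.csv", parse_dates=["date"])

df.set_index("date")["sentiment"].plot(figsize=(12, 4))
plt.title("Tweet sentiment over time")
plt.ylabel("sentiment (-1 = negative, 1 = positive)")
plt.tight_layout()
plt.show()
```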
Results
Donald Trump is looking really positive in his tweets.
Summary
TextBlob is a rule-based sentiment analysis library. It has a set of words, phrases, and conditions that define positive, negative, and neutral. It is actually quite weak, as it does not understand the meaning of a sentence; sarcasm, for example, definitely trips up such systems. There are other rule-based libraries available that do a similar job.
Further Learning: Tasks to Do
I have set some tasks for you that will add to this project and to your knowledge.
1. Select a product or a problem and get people’s sentiment about it. Some changes and updates to the scripts will be needed.
2. You can fork the repository and update the results.
3. You can also work with other libraries like VADER, etc., and perform emoticon and emoji analysis. You can refer to the Neptune AI blog to get started on this task; they have done sentiment analysis by building it from scratch and by using libraries like TextBlob, Flair, and VADER.
If you like the work, do give the post a clap and my GitHub repo a star.