In this post, you will learn how to get started with mining Twitter data. This will be very helpful if you would like to build machine learning models based on NLP techniques. The Python source code used in this post was developed in a Jupyter notebook. The following are the key aspects of getting started with the Python Twitter APIs.
In this section, you will learn about the following two key aspects that you need to take care of before getting started with development for Twitter data mining: setting up a Twitter developer app and installing the Python twitter package.
First and foremost, get set up with a Twitter developer app. In order to get set up, you will have to do the following: sign in to the Twitter developer portal (developer.twitter.com), create an app, and generate the consumer key/secret and OAuth access token/secret that are used in the code later in this post.
In order to get set up with the Python twitter package, run the following command in a Jupyter notebook cell.
!pip install twitter
Once installed, execute the following command in another cell to verify the installation.
import twitter
twitter?
The above prints the help documentation of the twitter package. You can read the details about the Twitter APIs on the Twitter API documentation page.
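Note that the ? syntax is specific to IPython/Jupyter. If you are running a plain Python script instead, the built-in help() function gives roughly the same information:
# help() works everywhere, including outside Jupyter
import twitter
help(twitter)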
Once you are set up, the next step is to establish a connection with Twitter so that you can invoke the Twitter APIs and extract content. Here is the code.
import twitter
# These are dummy keys
#
CONSUMER_KEY = 'fabcCMBqAABB43XSEjyMNEFGO'
CONSUMER_SECRET = 'gpIMAbCdSsAAKKtApABCDEZJnvz12erfr9rANcrTGV5af4gfGv'
OAUTH_ACCESS_TOKEN = '1234567897-qNAbCiVABCDERQ5CIjxxfs67lJfEWBQGJO'
OAUTH_ACCESS_TOKEN_SECRET = 'jOneIJFEFGHWaCfu4vzmtABCDDwPmnopqVGRad5GHJTbgF'
auth = twitter.oauth.OAuth(OAUTH_ACCESS_TOKEN, OAUTH_ACCESS_TOKEN_SECRET,
                           CONSUMER_KEY, CONSUMER_SECRET)
twitter_api = twitter.Twitter(auth=auth)
# Nothing to see by displaying twitter_api except that it's now a
# defined variable
print(twitter_api)
The above prints something like the following, which indicates that you have successfully used the OAuth credentials to gain authorization to query Twitter's API.
<twitter.api.Twitter object at 0x0000028B9FF162E8>
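Hard-coding the keys as shown above is fine for a quick notebook experiment, but you may prefer to keep them out of your source files. Here is a minimal sketch that reads the same four credentials from environment variables; the variable names used below are arbitrary and only illustrative.
import os
import twitter

# Assumed environment variable names -- use whatever names you export
CONSUMER_KEY = os.environ['TWITTER_CONSUMER_KEY']
CONSUMER_SECRET = os.environ['TWITTER_CONSUMER_SECRET']
OAUTH_ACCESS_TOKEN = os.environ['TWITTER_OAUTH_ACCESS_TOKEN']
OAUTH_ACCESS_TOKEN_SECRET = os.environ['TWITTER_OAUTH_ACCESS_TOKEN_SECRET']

auth = twitter.oauth.OAuth(OAUTH_ACCESS_TOKEN, OAUTH_ACCESS_TOKEN_SECRET,
                           CONSUMER_KEY, CONSUMER_SECRET)
twitter_api = twitter.Twitter(auth=auth)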
In this section, you will see examples of Twitter API usage for getting location-based trends and user timelines. You can find the full list of available endpoints in the Twitter API docs.
import json
# Get access to Where on Earth (WoE) Ids on this
# page, https://codebeautify.org/jsonviewer/f83352
INDIA_WOE_ID = 2282863
# Prefix ID with the underscore for query string parameterization.
# Without the underscore, the twitter package appends the ID value
# to the URL itself as a special case keyword argument.
# The following prints location-based trends
india_trends = twitter_api.trends.place(_id=INDIA_WOE_ID)
print(json.dumps(india_trends, indent=1))
# The following prints user timelines.
# The screen_name parameter is passed the user handle
twitter_api.statuses.user_timeline(screen_name="vitalflux")
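To work with the trends result rather than just dump the raw JSON, you can pull the trend names out of the response. The sketch below assumes the standard v1.1 trends/place response shape, i.e. a one-element list whose 'trends' key holds a list of dicts, each with a 'name' field.
# Assumes the usual v1.1 trends/place response shape
trend_names = [trend['name'] for trend in india_trends[0]['trends']]
print(trend_names[:10])  # first ten trending topics for the given WOE ID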
One popular use case is searching Twitter for recent tweets matching a hashtag and keeping only the popular ones based on retweet count. Here is the code, which searches Twitter for the hashtag deeplearning (#deeplearning).
#
# Search Twitter for hashtag, #deeplearning
#
# The v1.1 search endpoint returns at most 100 tweets per request (count parameter)
tweets = twitter_api.search.tweets(q="#deeplearning", count=100)
#
# Print the tweets
#
RETWEET_COUNT_THRESHOLD = 25
for status in tweets['statuses']:
    if status['retweet_count'] > RETWEET_COUNT_THRESHOLD:
        # Not every status is a retweet or carries URL entities, so guard the lookup
        urls = status.get('retweeted_status', status)['entities']['urls']
        tweet_url = urls[0]['expanded_url'] if urls else 'N/A'
        print('\n\n', status['user']['screen_name'], ":", status['text'],
              '\nTweet URL: ', tweet_url,
              '\nRetweet count: ', status['retweet_count'])
Pay attention to some of the following aspects in the above code:
- The search endpoint is queried with q="#deeplearning"; the matching tweets are returned under the 'statuses' key of the response.
- RETWEET_COUNT_THRESHOLD is used to keep only the tweets having more than 25 retweets.
- For each such tweet, the user's screen name, the tweet text, the expanded URL (when present) and the retweet count are printed.
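A single search call returns at most one page of results. If you need more, the v1.1 search response carries a search_metadata block with a next_results query string that can be fed back into the endpoint. The following is a rough pagination sketch under that assumption, continuing from the tweets result above, rather than production-ready code.
from urllib.parse import parse_qsl

# Follow 'next_results' from search_metadata until a few pages
# have been collected or no further page is available.
all_statuses = list(tweets['statuses'])
search_results = tweets
for _ in range(4):  # fetch up to four more pages
    next_results = search_results.get('search_metadata', {}).get('next_results')
    if not next_results:
        break
    # next_results looks like '?max_id=...&q=%23deeplearning&...'
    kwargs = dict(parse_qsl(next_results[1:]))
    search_results = twitter_api.search.tweets(**kwargs)
    all_statuses += search_results['statuses']

print(len(all_statuses), 'tweets collected')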
Hello there,
I am currently preparing for an exam in data mining by working through the exercises. I replicated your code with a few changes: I import the keys from a config file, as I find it more convenient than rewriting them each time, and I changed the WOEID to the US ID since we are asked to request the trends for the US.
Now I get a 403 HTTP error, suggesting that I am not authorized to execute this request. Could this have to do with Twitter's update to the 2.0 API?