I want to extract tweets and use them in a predictive analysis model.
I have a standard Twitter developer account. I created a project, created an app under that project, and I am using the tokens from that app.
My code is as follows:
import os
import requests  # needed by connect_to_endpoint below

os.environ['TOKEN'] = 'I put my token here'

def auth():
    return os.getenv('TOKEN')
def create_headers(bearer_token):
    headers = {"Authorization": "Bearer {}".format(bearer_token)}
    return headers
def create_url(keyword, start_date, end_date, max_results=10):
    search_url = "https://api.twitter.com/2/tweets/search/all"  # Change to the endpoint you want to collect data from
    # Change params based on the endpoint you are using
    query_params = {'query': keyword,
                    'start_time': start_date,
                    'end_time': end_date,
                    'max_results': max_results,
                    'expansions': 'author_id,in_reply_to_user_id,geo.place_id',
                    'tweet.fields': 'id,text,author_id,in_reply_to_user_id,geo,conversation_id,created_at,lang,public_metrics,referenced_tweets,reply_settings,source',
                    'user.fields': 'id,name,username,created_at,description,public_metrics,verified',
                    'place.fields': 'full_name,id,country,country_code,geo,name,place_type',
                    'next_token': {}}
    return (search_url, query_params)
def connect_to_endpoint(url, headers, params, next_token=None):
    params['next_token'] = next_token  # params object received from create_url function
    response = requests.request("GET", url, headers=headers, params=params)
    print("Endpoint Response Code: " + str(response.status_code))
    if response.status_code != 200:
        raise Exception(response.status_code, response.text)
    return response.json()
#Inputs for the request
bearer_token = auth()
headers = create_headers(bearer_token)
keyword = "xbox lang:en"
start_time = "2021-03-01T00:00:00.000Z"
end_time = "2021-03-31T00:00:00.000Z"
max_results = 15
url = create_url(keyword, start_time, end_time, max_results)
json_response = connect_to_endpoint(url[0], headers, url[1])
When I run the last 2 lines, I get the following error:
(403, '{"client_id":"22361938","detail":"When authenticating requests to the Twitter API v2 endpoints, you must use keys and tokens from a Twitter developer App that is attached to a Project. You can create a project via the developer portal.","registration_url":"https://developer.twitter.com/en/docs/projects/overview","title":"Client Forbidden","required_enrollment":"Standard Basic","reason":"client-not-enrolled","type":"https://api.twitter.com/2/problems/client-forbidden"}')
or sometimes:
(401, '{"errors":[{"message":"Invalid or expired token","code":89}]}\n')
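Both error bodies are JSON, so the fields that point at the cause can be pulled out rather than read from the raw string. The following is a minimal sketch that assumes only the two payload shapes shown above; `summarize_error` is a hypothetical helper name, not part of `requests` or of any Twitter library.

```python
import json

def summarize_error(status_code, body):
    """Condense an error body into one line.

    Assumes one of the two shapes seen above: a top-level "errors" list,
    or a problem object with "title", "reason" and "required_enrollment".
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Body was not JSON; fall back to the raw text.
        return f"{status_code}: {body.strip()}"
    if not isinstance(payload, dict):
        return f"{status_code}: {body.strip()}"
    if "errors" in payload:
        msgs = "; ".join(
            f"{e.get('message')} (code {e.get('code')})" for e in payload["errors"]
        )
        return f"{status_code}: {msgs}"
    parts = [payload.get("title"), payload.get("reason"), payload.get("required_enrollment")]
    return f"{status_code}: " + " | ".join(p for p in parts if p)
```

Read this way, the 403 carries `reason: client-not-enrolled`, which looks like an access-level problem for the requested endpoint rather than a bug in the request code, while the 401 points at the token value itself. As far as I know, `search/all` is the full-archive search and is not available at every access level, so trying the same request against `https://api.twitter.com/2/tweets/search/recent` may be worth a check.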
I found this code on Towards Data Science and on Kaggle and wanted to try running it.
I also want to collect tweets from one particular country only (India). I know I need to use place_country, but I am not sure how.
In addition, I want to collect all tweets from the previous day: every tweet, not just those matching the query keyword, and not just 10 tweets as in my code.
It would also be great if someone could point me to working code for extracting tweets.
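For reference, here is a hedged sketch of the three pieces this part of the question touches: a query string with a country filter, a time window for the previous day, and a loop that follows `next_token` to get past the first page. The helper names `build_query`, `previous_day_window` and `collect_all` are hypothetical. The sketch assumes that the `place_country:` operator is available at the account's access level, that times are wanted in UTC, and that each response has the v2 shape with a `data` list and a `meta.next_token` field.

```python
from datetime import datetime, timedelta, timezone

def build_query(keyword, country=None, lang=None):
    """Join a keyword with optional place_country: and lang: operators."""
    parts = [keyword]
    if country:
        parts.append(f"place_country:{country}")  # two-letter code, e.g. IN
    if lang:
        parts.append(f"lang:{lang}")
    return " ".join(parts)

def previous_day_window(now=None):
    """Return (start_time, end_time) covering the previous UTC day."""
    now = now or datetime.now(timezone.utc)
    today = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    start = today - timedelta(days=1)
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"  # same format as in the code above
    return start.strftime(fmt), today.strftime(fmt)

def collect_all(fetch_page, max_pages=10):
    """Call fetch_page(next_token) repeatedly and gather every "data" item.

    fetch_page is any callable returning one parsed JSON page; max_pages
    is a safety cap so a long result set cannot loop forever.
    """
    tweets, token = [], None
    for _ in range(max_pages):
        page = fetch_page(token)
        tweets.extend(page.get("data", []))
        token = page.get("meta", {}).get("next_token")
        if not token:
            break
    return tweets
```

With the functions from the question, this could be wired up as `collect_all(lambda tok: connect_to_endpoint(url[0], headers, url[1], tok))` after building `url` from `build_query("xbox", "IN", "en")` and the two values returned by `previous_day_window()`. The search endpoints expect a non-empty query as far as I know, so "all tweets" would still need at least one operator in the query string.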