
I am wondering how I can automate my program to fetch tweets at the maximum rate of 180 requests per 15 minutes. At the maximum count of 100 tweets per request, that totals 18,000 tweets per 15-minute window. I am creating this program for an independent case study at school.

I would like my program to avoid being rate limited and terminated. Ideally, it would constantly use the maximum number of requests per 15 minutes, so that I could leave it running for 24 hours without user interaction and retrieve all the tweets possible for analysis.

Here is my code. It gets the tweets matching a query and writes them to a text file, but it eventually gets rate limited. I would really appreciate the help.

import logging
import time
import csv
import twython
import json

app_key = ""
app_secret = ""
oauth_token = ""
oauth_token_secret = ""

twitter = twython.Twython(app_key, app_secret, oauth_token, oauth_token_secret)

tweets = []
MAX_ATTEMPTS = 1000000
# Max Number of tweets per 15 minutes
COUNT_OF_TWEETS_TO_BE_FETCHED = 18000 

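for i in range(0, MAX_ATTEMPTS):

    # Stop once the target number of tweets has been collected.
    if COUNT_OF_TWEETS_TO_BE_FETCHED < len(tweets):
        break

    if 0 == i:
        # First page: there is no max_id yet.
        results = twitter.search(q="$AAPL", count=100, lang='en')
    else:
        # Later pages: continue from the max_id of the previous page.
        results = twitter.search(q="$AAPL", count=100, lang='en',
                                 include_entities='true', max_id=next_max_id)

    for result in results['statuses']:
        print(result)
        tweets.append(result)

        with open('tweets.txt', 'a') as outfile:
            json.dump(result, outfile, sort_keys=True, indent=4)

    try:
        next_results_url_params = results['search_metadata']['next_results']
        next_max_id = next_results_url_params.split('max_id=')[1].split('&')[0]
    except (KeyError, IndexError):
        # No next_results entry means there are no more pages.
        break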

1 Answer


You should be using Twitter's Streaming API.

This will allow you to receive a near-realtime feed of your search. You can write those tweets to a file just as fast as they come in.

Using the track parameter you will be able to receive only the specific tweets you're interested in.

You'll need to use Twython's TwythonStreamer class, and your code will look something like this:

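from twython import TwythonStreamer

# Fill in the same four credentials used in the question.
APP_KEY = ""
APP_SECRET = ""
OAUTH_TOKEN = ""
OAUTH_TOKEN_SECRET = ""

class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        # Called for every message; only tweets carry a 'text' field.
        if 'text' in data:
            print(data['text'])

    def on_error(self, status_code, data):
        print(status_code)

stream = MyStreamer(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
stream.statuses.filter(track='$AAPL')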
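If you want to write each tweet to a file as it arrives, one option is to append one JSON object per line, so the file can be read back line by line later. This is only a sketch: `append_tweet`, `read_tweets` and the file name `tweets.jsonl` are hypothetical names, not part of Twython.

```python
import json


def append_tweet(path, tweet):
    """Append one tweet (a dict) to `path` as a single JSON line."""
    with open(path, 'a') as outfile:
        outfile.write(json.dumps(tweet, sort_keys=True) + '\n')


def read_tweets(path):
    """Read the file back with one json.loads call per non-empty line."""
    with open(path) as infile:
        return [json.loads(line) for line in infile if line.strip()]
```

Inside `on_success` you could then call `append_tweet('tweets.jsonl', data)` instead of printing. Unlike appending several indented `json.dump` outputs to the same file, every line here is a complete JSON document on its own.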
Terence Eden
  • Thank you for the code. I already have a streamer programmed. What I want to do is go back the 7-9 days that the Twitter REST API allows and get the tweets that mention the query (in this case, $AAPL). – Justin6493 Feb 21 '15 at 16:34
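For the backfill case described in this comment, one possible approach is to keep paging through the search results but space the requests evenly: 180 requests per 900 seconds works out to one request every 5 seconds. The sketch below shows only the pacing logic. `paced_search` and `fetch_page` are hypothetical names; `fetch_page` is assumed to take the current `max_id` (`None` for the first page) and return `(statuses, next_max_id)`, with `next_max_id` set to `None` when there are no more pages.

```python
import time


def paced_search(fetch_page, max_requests=180, window_seconds=900,
                 sleep=time.sleep, clock=time.monotonic):
    """Yield pages from fetch_page, leaving at least
    window_seconds / max_requests seconds between calls."""
    min_interval = float(window_seconds) / max_requests  # 900 / 180 = 5 seconds
    max_id = None
    last_call = None
    while True:
        if last_call is not None:
            # Sleep only for the part of the interval that has not elapsed yet.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        statuses, max_id = fetch_page(max_id)
        yield statuses
        if max_id is None:
            break
```

With Twython, `fetch_page` could wrap `twitter.search(q="$AAPL", count=100, max_id=max_id)` and parse `next_results` the same way the question's code does. It may also be worth catching `twython.TwythonRateLimitError` around that call and sleeping until the window resets, in case something else is using requests from the same window.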