
GetOldTweets3 is a Python library for accessing old tweets, something that is not easily feasible with libraries like Tweepy. Recently, however, it has an unresolved issue caused by changes to the Twitter API: https://github.com/Mottl/GetOldTweets3/issues/98.

The question is: what is an alternative library to GetOldTweets3 for retrieving tweets without time constraints? In my experience, Tweepy cannot retrieve more than 200 tweets.

ECub Devs

3 Answers


So far, the only method of scraping tweets that still seems to work is snscrape's jsonl method, or this: https://github.com/rsafa/get-latest-tweets/
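The jsonl method writes one JSON object per line, e.g. `snscrape --jsonl twitter-search "some query" > tweets.jsonl` (flags per snscrape's README; check your installed version). The output can then be parsed with the standard library; the sample lines below are fabricated stand-ins for real snscrape output:

```python
import json

def load_tweets(jsonl_text):
    """Parse snscrape --jsonl output: one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

# Fabricated example lines standing in for real snscrape output.
sample = '{"id": 1, "content": "hello"}\n{"id": 2, "content": "world"}\n'
tweets = load_tweets(sample)
print([t["id"] for t in tweets])  # [1, 2]
```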

Jimmy

The 200-tweet limit is a per-request maximum. You can retrieve successive "pages" of tweets by passing the returned next parameter to request the next page of 200. If you are using the Standard Search API, these requests will not return tweets older than about a week. With the Premium Search API full-archive endpoint you can get all tweets going back to 2006.
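The paging loop can be sketched generically. This is a minimal illustration, not Twitter's actual API: `fetch_page` stands in for any search call that takes a cursor and returns a page of results plus the cursor for the next page (None when exhausted).

```python
def fetch_all(fetch_page, cursor=None):
    """Collect every page by following the 'next' cursor until it runs out."""
    items = []
    while True:
        page, cursor = fetch_page(cursor)
        items.extend(page)
        if cursor is None:
            return items

# Fake three "pages" of 200 tweets each to show the loop terminating.
pages = {None: (list(range(200)), "p2"),
         "p2": (list(range(200, 400)), "p3"),
         "p3": (list(range(400, 600)), None)}
all_items = fetch_all(lambda c: pages[c])
print(len(all_items))  # 600
```

With a real client you would plug the API's search call in as `fetch_page`, passing along the `next` token the response returned.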

It is explained in detail here: https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/api-reference

Jonas
  • Can you please tell me how exactly I can retrieve more than 200 tweets with Tweepy?

        def text_query_to_csv(text_query, count):
            try:
                for tweet in api.search(q=text_query, count=count):
                    # Adding to list that contains all tweets
                    tweets.append((tweet.created_at, tweet.user.screen_name, tweet.text, tweet.entities))
                tweetsdf = pd.DataFrame(tweets, columns=['Datetime', 'Screen Name', 'Text', 'Entities'])
                tweetsdf.to_csv('{}-tweets.csv'.format(text_query))

    – ECub Devs Sep 23 '20 at 21:35
    I don't use Tweepy, but I can give you an example using [TwitterAPI](https://github.com/geduldig/TwitterAPI). https://github.com/geduldig/TwitterAPI/blob/master/examples/page_tweets.py – Jonas Sep 23 '20 at 23:01

I would recommend using snscrape. The IDs collected this way can then be passed to api.statuses_lookup. Using api.statuses_lookup, you can download 300 × 100 tweets per 15 minutes via the Twitter API.

import tweepy

# You have a list of all tweet IDs collected with snscrape: all_id_list.
# Split it into chunks of 100 IDs (statuses_lookup accepts at most 100 per call).
id_list = [all_id_list[x:x+100] for x in range(0, len(all_id_list), 100)]

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
api = tweepy.API(auth)

# Iterate over the chunks, fetching 100 tweets per request via the Twitter API.
for chunk in id_list:
     tweets = api.statuses_lookup(chunk)

     for tweet in tweets:
          print(tweet.text)
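Given the 300-requests-per-15-minutes limit and 100 IDs per request, you can estimate how long a backfill will take. A minimal sketch; the helper below is hypothetical, not part of Tweepy:

```python
import math

REQUESTS_PER_WINDOW = 300   # statuses_lookup requests allowed per 15-minute window
IDS_PER_REQUEST = 100       # maximum IDs accepted per statuses_lookup call

def windows_needed(n_ids):
    """Number of 15-minute rate-limit windows needed to hydrate n_ids tweets."""
    requests = math.ceil(n_ids / IDS_PER_REQUEST)
    return math.ceil(requests / REQUESTS_PER_WINDOW)

print(windows_needed(30000))   # 1  (300 requests fit in one window)
print(windows_needed(100000))  # 4  (1000 requests -> 4 windows)
```

In practice, constructing the API with `tweepy.API(auth, wait_on_rate_limit=True)` lets Tweepy sleep through the windows for you instead of raising rate-limit errors.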
padul