Twitter Scraper Rate Limit

Question

I am trying to scrape the "Following" account information (username, website, last tweet date) for every account that a given account follows, for example https://www.twitter.com/verified/following. As you can see, that account follows 365.7K users.

I have already scraped the usernames, and now I need to visit each profile link and scrape that data. The code works and gets all the information needed, but after a certain number of page visits Twitter says I have exceeded the rate limit and stops showing any information about the accounts I visit.

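from time import sleep

import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumption: the question does not show how the driver is created

def get_user_info(user):
    """Gets User Info - Username, Website, Last Tweet Date"""
    driver.get(user[0])  # user[0] is the profile URL
    sleep(1)
    username = '@' + user[0].split('/')[-1]
    attempt = 0
    # Retry once if an element is missing, in case the page has not finished loading
    while True:
        try:
            website = driver.find_element(By.XPATH, "//div[@data-testid='UserProfileHeader_Items']/a").get_attribute('href')
        except NoSuchElementException:
            website = 'No Website'
            attempt += 1
            sleep(1)
        try:
            last_tweet_date = driver.find_element(By.XPATH, "//time").get_attribute('datetime')
        except NoSuchElementException:
            last_tweet_date = 'No Tweets'
            attempt += 1
            sleep(1)
        if website != 'No Website' and last_tweet_date != 'No Tweets':
            break
        if attempt > 1:
            break

    info = (username, website, last_tweet_date)
    return info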

def user_info():
    info_list = []
    users_df = pd.read_csv('UserLinks.csv')
    user_list = users_df.values.tolist()
    for user in user_list:
        info = get_user_info(user)
        info_list.append(info)

    info_df = pd.DataFrame(info_list, columns=['Username', 'Website', 'Last Tweet Date'])
    info_df.to_csv('List2.csv', index=False)

What do you suggest?

Do you use twitter api? https://developer.twitter.com/en/docs/twitter-api/migrate — data_m, Oct 22 '20 at 08:47
You'd have to of course abide by the allowed rate limit to be able to scrape without a suspension. — Lenin, Oct 22 '20 at 08:53

score 1 · Answer 1 · answered Oct 24 '20 at 04:18

Here's my answer to a similar question on rate limits:

How Rate Limit Works in Twitter

Essentially, every API has a rate limit that renews after a certain time window, e.g. 15 minutes. So you need to watch the rate limit headers or keep count yourself. When you reach the limit, pause your application and resume when the next rate limit window starts. Some API endpoints have a count parameter, and you'll want to set it to its maximum to get the most results per request. Also, application auth typically gets more requests than user auth, if it's available for a given API call.
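As a rough sketch of the two options above (watching the headers, or keeping count yourself), something like the following could work. The header names `x-rate-limit-remaining` and `x-rate-limit-reset` are the ones the Twitter API returns; `wait_from_headers`, `WindowCounter`, and the `max_requests` value are hypothetical names and numbers chosen for illustration, not part of any library.

```python
import time

WINDOW_SECONDS = 15 * 60  # example window from the text; the real value depends on the endpoint


def wait_from_headers(headers, now=None):
    """Return how many seconds to pause, based on rate limit response headers."""
    now = time.time() if now is None else now
    remaining = int(headers.get("x-rate-limit-remaining", 1))
    reset_at = int(headers.get("x-rate-limit-reset", 0))  # epoch seconds
    if remaining > 0:
        return 0.0
    return max(0.0, reset_at - now)


class WindowCounter:
    """Keeps count yourself, for cases where no headers are visible (e.g. a browser)."""

    def __init__(self, max_requests, window_seconds=WINDOW_SECONDS, clock=time.time):
        self.max_requests = max_requests  # assumed budget per window
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def seconds_to_wait(self):
        """Record one request and return how long to sleep before making it."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # The previous window has expired: start a fresh one
            self.window_start = now
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return 0.0
        # Budget used up: wait for the next window and count this request in it
        wait = self.window_start + self.window_seconds - now
        self.window_start += self.window_seconds
        self.count = 1
        return wait
```

In the scraping loop from the question, this would mean calling `time.sleep(counter.seconds_to_wait())` before each `get_user_info(user)`. The right `max_requests` for plain page visits is not stated anywhere in the question, so it would have to be found by experiment and set conservatively.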
