I'm trying to fetch some JSON data from a subreddit with Ruby, but the request fails with a 429 error.

  require 'open-uri'

  begin
    request = URI.open(
      'https://www.reddit.com/r/vintageobscura.json',
      "User-Agent" => "web:myapp:v1.0.0 (by /u/myusername)"
    )
  rescue OpenURI::HTTPError => error
    response = error.io
    # response.status is a [code, message] pair, e.g. ["429", "Too Many Requests"]
    raise StandardError, format('Error while opening document: %s', response.status.join(' '))
    #puts response.string
  end

It works when I load the URL in my browser, and as you can see, I have a User-Agent defined as per their API rules.

Any idea why it fails?

Thanks a lot!

gordie
  • [429](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429) means "too many requests" – you probably exceed their rate limit. Monitor the `X-Ratelimit` HTTP headers as mentioned in the API rules. – Stefan Oct 06 '21 at 11:24
  • @Stefan Not sure this is the problem, since it works within my browser... ? – gordie Oct 06 '21 at 11:26
  • What do the `X-Ratelimit` headers say, are you above the limit? – Stefan Oct 06 '21 at 11:27
  • Sorry to ask, but how do I check that ? – gordie Oct 06 '21 at 11:42

2 Answers

HTTP 429 means "Too Many Requests". Reddit's API rate-limits clients, crawlers and scrapers.

There are three response headers you can use to check the rate limiting status. Here is the documentation.

  • X-Ratelimit-Used: Approximate number of requests used in this period
  • X-Ratelimit-Remaining: Approximate number of requests left to use
  • X-Ratelimit-Reset: Approximate number of seconds to end of period

require "open-uri"

URI.open(
  'https://www.reddit.com/r/vintageobscura.json',
  "User-Agent"=>"web:myapp:v1.0.0 (by /u/myusername)"
) { |f| pp f.meta }

# {
#   "x-ratelimit-remaining"=>"289",
#   "x-ratelimit-used"=>"11",
#   "x-ratelimit-reset"=>"125",
#   ...
# }

The Shopify/limiter gem can make your code wait a certain amount of time before sending the next request. Alternatively, you can implement this waiting mechanism yourself using the headers above.
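
A hand-rolled version of that waiting mechanism could look roughly like the sketch below. The helper names `rate_limit_delay` and `fetch_politely` and the `min_remaining` threshold are made up for illustration; the sketch assumes the headers arrive with lowercase keys, as in the `f.meta` output shown above, and it only pauses after a successful response (a 429 would still raise `OpenURI::HTTPError`).

```ruby
require "open-uri"

# Hypothetical helper: given the response headers (lowercase keys, as in
# open-uri's `meta`), return how many seconds to pause before the next request.
def rate_limit_delay(meta, min_remaining: 1)
  # No rate-limit information at all: nothing to wait for.
  return 0.0 unless meta.key?("x-ratelimit-remaining")

  remaining = meta["x-ratelimit-remaining"].to_f
  reset     = meta["x-ratelimit-reset"].to_f # a missing header becomes 0.0

  # Once the budget is (nearly) spent, wait until the period resets.
  remaining < min_remaining ? reset : 0.0
end

# Hypothetical wrapper: fetch a URL, then sleep if the budget is exhausted.
def fetch_politely(url, user_agent)
  URI.open(url, "User-Agent" => user_agent) do |f|
    body  = f.read
    delay = rate_limit_delay(f.meta)
    sleep(delay) if delay.positive?
    body
  end
end
```

Calling `fetch_politely('https://www.reddit.com/r/vintageobscura.json', 'web:myapp:v1.0.0 (by /u/myusername)')` in a loop would then slow down by itself whenever the remaining budget drops below the threshold.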

vinibrsl

For anyone interested, I ended up using OAuth and requesting a token each time. It could probably be improved, but it works:

require 'rest-client'
require 'json'
require 'open-uri'

# These methods live in a class that sets @client_id and @client_secret.
def get_access_token
  puts "getting reddit access token"
  begin
    resp = RestClient::Request.execute(
      method: :post,
      url: 'https://www.reddit.com/api/v1/access_token',
      user: @client_id,
      password: @client_secret,
      payload: 'grant_type=client_credentials'
    )
    response = JSON.parse(resp.body)
    response['access_token']
  rescue StandardError => e
    raise StandardError, "Error getting Reddit OAuth2 token: #{e.message}"
  end
end

def get_json(subreddit_slug)
  url = format('https://oauth.reddit.com/r/%s.json', subreddit_slug)

  token = get_access_token

  # Authenticated requests go to oauth.reddit.com with a Bearer token.
  options = {
    'Authorization' => "Bearer #{token}",
    "User-Agent" => "ruby:#{APP_NAME}:v#{APP_VERSION} (by /u/#{REDDIT_USER})"
  }
  request = URI.open(url, options)
  request.read
end
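
One possible improvement is to reuse the token until it expires instead of requesting a new one for every call. The sketch below is only an illustration: `TokenCache`, its `margin` and `clock` parameters are hypothetical names, and it assumes the token response carries an `expires_in` field in seconds, as OAuth2 token responses usually do.

```ruby
# Hypothetical cache: keeps the last token and only calls the fetcher block
# again once the token is about to expire.
class TokenCache
  # margin: refresh this many seconds before the reported expiry.
  # clock:  injectable time source, which makes the class easy to test.
  def initialize(margin: 60, clock: -> { Time.now }, &fetcher)
    @fetcher = fetcher
    @margin = margin
    @clock = clock
    @token = nil
    @expires_at = nil
  end

  def token
    now = @clock.call
    if @token.nil? || now >= @expires_at
      # The fetcher is expected to return the parsed token response (a Hash).
      data = @fetcher.call
      @token = data.fetch("access_token")
      # Assumption: "expires_in" is the lifetime in seconds; if it is absent,
      # the token is treated as already expired and refetched on the next call.
      @expires_at = now + data.fetch("expires_in", 0).to_i - @margin
    end
    @token
  end
end
```

To use it with the code above, the block would need to return the whole parsed response rather than just the token string, for example `TokenCache.new { JSON.parse(resp.body) }` wrapped around the same `RestClient` call.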
gordie