
I have a question about how to slow down my API requests. A particular third-party API I am hitting allows me to make 3 requests every 2 seconds. If I go over that limit, I am returned status code 429 along with a wait time in milliseconds.

This API is called often, as a direct result of incoming requests to my own server, which are not rate limited.

Since I do not have any need for synchronous handling of the third-party API requests, I decided to offload the work to my Elastic Beanstalk worker box on AWS, which by default reads from Amazon SQS.

As a result, my worker throws the SQS message back into the queue if status code 429 is returned from the third-party API. This eventually makes the API call succeed once the wait time has elapsed, but it seems like a bad solution.
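For concreteness, here is a minimal sketch of what my worker currently does, assuming an Express app sitting behind the Elastic Beanstalk worker daemon (the daemon POSTs each SQS message to the app; a 200 response deletes the message, anything else returns it to the queue). The endpoint path and third-party URL are simplified placeholders:

```typescript
import express from "express";

const app = express();
app.use(express.json());

app.post("/", async (req, res) => {
  // Placeholder third-party call; Node 18+ provides a global fetch.
  const apiRes = await fetch("https://api.example.com/update", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req.body),
  });

  if (apiRes.status === 429) {
    // Non-2xx response: the daemon leaves the message in the queue,
    // and it becomes visible again after the visibility timeout.
    res.sendStatus(500);
    return;
  }
  res.sendStatus(200); // success: the daemon deletes the message
});

app.listen(Number(process.env.PORT) || 8080);
```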

Is there any way to tell the daemon on the worker box to leave the message in the queue for the allotted wait time? Or can I perhaps set the rate at which the daemon will read from the queue? I'm looking for a proper way (implementation specific) to rate limit using the worker and the queue on AWS. Thank you so much for the help!

EDIT: I would have assumed that there are configurations that could be modified on AWS to do what I am asking, but either way I'm looking for specific solutions for the setup I described. I'm not quite sure how to modify or control the daemon on the Elastic Beanstalk worker box.

AIntel
  • what is the purpose behind hitting the 3rd party API? what is the trigger for calling it? – ketan vijayvargiya Nov 14 '16 at 18:39
  • I am using a third-party email marketing service for populating/updating a client's email marketing account. There are many triggers in my product for calling it, mostly related to updating and populating these marketing accounts in real time. – AIntel Nov 14 '16 at 18:58

3 Answers


As I understand it, you have a bunch of triggers for calling a 3rd-party service and you need to rate-limit your API calls.

The best way is to rate-limit the daemon that is reading from SQS. Depending on the language the daemon is written in, you should be able to easily find rate-limiter libraries that you can reuse; Java and Python, for example, both have well-tested libraries.

Keep in mind that these libraries will allow X requests per second per worker. If you have one daemon running, X will be 1.5 for your use case (3 requests per 2 seconds). If you have two daemons (e.g., one each on two different machines), X should be 0.75.
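A minimal sketch of such a limiter, assuming a hand-rolled token bucket in the Node.js worker rather than any particular library (the limit values come from the question; everything else is illustrative):

```typescript
// Token bucket sized to the third-party limit: 3 requests per 2 seconds.
class TokenBucket {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(
    private readonly capacity: number,
    private readonly refillIntervalMs: number,
  ) {
    this.tokens = capacity;
  }

  // Resolves once a token is available, delaying the caller if necessary.
  async take(): Promise<void> {
    for (;;) {
      const now = Date.now();
      if (now - this.lastRefill >= this.refillIntervalMs) {
        this.tokens = this.capacity;
        this.lastRefill = now;
      }
      if (this.tokens > 0) {
        this.tokens -= 1;
        return;
      }
      // Sleep until the current window ends, then re-check.
      const waitMs = this.refillIntervalMs - (now - this.lastRefill);
      await new Promise((resolve) => setTimeout(resolve, waitMs));
    }
  }
}

const bucket = new TokenBucket(3, 2000);

// Wrap every third-party call so it waits for a token first.
async function rateLimitedFetch(url: string, init?: RequestInit) {
  await bucket.take();
  return fetch(url, init); // Node 18+ global fetch
}
```

With two worker instances sharing the queue, halve the capacity per instance as described above.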

ketan vijayvargiya
  • This sounds like exactly what I need. I'm not sure how to modify the daemon that Amazon has running on my Node.js worker instance, however. Could you provide a bit more detail on implementing this on an Elastic Beanstalk worker? – AIntel Nov 14 '16 at 21:44
  • I think this question has been adequately answered here: you asked for a high-level design, which I think you've gotten. I'd suggest you raise another question, with details on your current setup. – ketan vijayvargiya Nov 15 '16 at 01:36
  • I added an edit to try to be more clear. I did describe my setup, and your answer is mostly just a restatement of what I originally posted as a possible solution. My question was how to implement this. Thanks for the help though! – AIntel Nov 15 '16 at 16:43
  • @AIntel if you need help with some code, you should write a [minimal example](http://stackoverflow.com/help/mcve), if you want people just to write code for you, this is an off-topic question on SO. – laughedelic Nov 15 '16 at 16:46
  • 3
    I don't believe I need help with any coding, nor do I think much coding is needed (looking at my edit). The question I have is about what people who have my setup as I described (a standard setup on AWS: elastic beanstalk worker option connected to a sqs queue all by default) do to achieve rate limiting. The answer above describes what I want. But I know what I want, I wrote that in my question. What I am looking for is how to implement, or incorporate a solution for the setup I described in the first 4 paragraphs. This is why the title, tags and question describe AWS so much. – AIntel Nov 15 '16 at 17:03

It sounds like you grab a message from the SQS queue and, while processing it, discover that you will not be able to complete the processing now, so you want to put off a retry until a known time in the future.

If that is the case, then you probably want to look at changing the message visibility timeout.

When you read a message from a queue it is not automatically deleted. Instead, it will automatically be made deliverable again if you do not delete the message within the visibility timeout. The idea is to ensure that all messages get retried until they are deleted, but not retry before the consumer has a chance to process and delete the message. The queue has a default visibility timeout which you can override on a per-message basis.
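With the AWS SDK for JavaScript v3, a per-message override might look like the sketch below; the region and the surrounding 429 handling are assumptions, and note that SQS expresses the timeout in seconds while the API reports its wait in milliseconds:

```typescript
import {
  SQSClient,
  ChangeMessageVisibilityCommand,
} from "@aws-sdk/client-sqs";

const sqs = new SQSClient({ region: "us-east-1" }); // assumed region

// Hide the message for roughly `waitMs` milliseconds by extending its
// visibility timeout (rounded up, since SQS counts in whole seconds).
async function deferRetry(
  queueUrl: string,
  receiptHandle: string,
  waitMs: number,
): Promise<void> {
  await sqs.send(
    new ChangeMessageVisibilityCommand({
      QueueUrl: queueUrl,
      ReceiptHandle: receiptHandle,
      VisibilityTimeout: Math.ceil(waitMs / 1000),
    }),
  );
}
```

This requires access to the message's receipt handle, so it applies most directly if you read from the queue yourself rather than through the worker daemon.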

Note that this approach only ensures that you will not receive that specific message from the queue before the timeout. If your client process continues to read messages, it will receive other messages in the meantime. This is probably what you want if different rate limits are associated with different messages. If not, though, you might prefer to have your client thread(s) sleep until the rate-limit wait time has passed. The details of how to get multiple threads across multiple servers to stop working until a set point in time are not related to AWS and are highly specific to the language you are using. If you decide to go this route, you should probably ask a separate question.

Rob

A pattern I commonly use is chaining the dispatch of the next job onto the end of my commands. When a job finishes its work, it recursively dispatches the next one; combined with a rate-limit check on a response header, or a timeout matching the 3rd-party API's rate limit, this was helpful. Depending on the nature of your state, you can either look up the next job in the DB or pop an ID off an array.

This pattern does have some caveats: first, if you didn't implement it from the start, you have to refactor your code; second, you may need cron to run a sanity check or to "tee up" the first job.

This pattern has worked well for me so far, but my user behavior has changed and there are now a variety of requests (with different types of jobs) that could be queued at any time, so the jobs need to know how the others impact the rate limit. This means keeping some sort of atomic count in a cache, which they all read to determine whether it's a good time to execute or to re-dispatch themselves for later. Alternatively, execute anyway, catch the 429 error, and re-dispatch with a delay. Both feel a little dirty.
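A minimal sketch of that shared counter, assuming Redis via the ioredis client and a fixed 2-second window keyed by time (the key name and window math are illustrative, not a definitive implementation):

```typescript
import Redis from "ioredis";

const redis = new Redis(); // assumes a reachable Redis instance

// Returns true if this job may call the API now; false means it should
// re-dispatch itself for later. One counter per 2-second window keeps
// every worker drawing from the same budget of 3 requests.
async function tryAcquireSlot(): Promise<boolean> {
  const windowKey = `api-slots:${Math.floor(Date.now() / 2000)}`;
  const count = await redis.incr(windowKey); // atomic increment
  if (count === 1) {
    // First hit in this window: expire the key so old counters vanish.
    await redis.expire(windowKey, 4);
  }
  return count <= 3;
}
```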

A built-in SQS solution for this exact use case would be neat.

danrichards