I’m trying to understand the token bucket algorithm used by API Gateway, but one scenario isn’t making sense to me. How does the algorithm behave when the burst is lower than the rate? If you did that, wouldn’t your rate limit effectively be your burst limit, since you could never pull more tokens out of the bucket than the bucket can hold?
For example: rate = 100, burst = 50.
T0: no requests are made, so the bucket fills to its capacity of 50.
T1: 100 requests arrive at the same instant, so 50 are accepted and 50 are dropped.
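Here’s a quick simulation of my mental model (just a sketch, assuming all 100 requests arrive at the exact same instant, the bucket refills continuously at `rate` tokens per second, and the token count is capped at `burst`):

```python
class TokenBucket:
    """Toy token bucket: refills at `rate` tokens/sec, capped at `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start full, like the T0 state above
        self.last = 0.0      # timestamp of the last refill (simulated seconds)

    def allow(self, now):
        # Refill based on elapsed time, never exceeding burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=100, burst=50)
# T1: 100 requests arrive at the same instant (elapsed time = 0 between them).
results = [bucket.allow(now=1.0) for _ in range(100)]
print(results.count(True), results.count(False))  # prints: 50 50
```

With truly simultaneous arrivals, only the 50 banked tokens are available, which is what makes me think burst becomes the effective cap. (If the 100 requests were instead spread across the second, the refill at 100 tokens/sec would let more through, but that’s exactly the part I’m unsure about.)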
Is this understanding correct? If so, why would you ever set rate > burst? In other words, why does API Gateway set its default rate to 10,000 and burst to 5,000?