I’m trying to understand the token bucket algorithm used by API Gateway, but one scenario isn’t making sense to me. How does the algorithm behave when the burst is lower than the rate? If you did that, wouldn’t your rate limit effectively be your burst limit, since you could never pull more tokens out of the bucket than the bucket can hold?
For example: rate = 100, burst = 50.
T0: no requests are made, so the bucket fills to its capacity of 50.
T1: 100 requests arrive at the same instant, so 50 are accepted and 50 are dropped.
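Here’s a quick simulation of my mental model (just a sketch, assuming all 100 requests arrive at the exact same instant, the bucket refills continuously at `rate` tokens per second, and the token count is capped at `burst`):

```python
class TokenBucket:
    """Toy token bucket: refills at `rate` tokens/sec, capped at `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start full, like the T0 state above
        self.last = 0.0      # timestamp of the last refill (simulated seconds)

    def allow(self, now):
        # Refill based on elapsed time, never exceeding burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=100, burst=50)
# T1: 100 requests arrive at the same instant (elapsed time = 0 between them).
results = [bucket.allow(now=1.0) for _ in range(100)]
print(results.count(True), results.count(False))  # prints: 50 50
```

With truly simultaneous arrivals, only the 50 banked tokens are available, which is what makes me think burst becomes the effective cap. (If the 100 requests were instead spread across the second, the refill at 100 tokens/sec would let more through, but that’s exactly the part I’m unsure about.)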
Is this understanding correct? If so, why would you ever set rate > burst? In other words, why does API Gateway set its default rate to 10,000 and burst to 5,000?