I'm designing a .NET client application for an external API. It's going to have two main responsibilities:
- Synchronization - making a batch of requests to API and saving responses to my database periodically.
- Client - a pass-through for requests to API from users of my client.
Service's documentation specifies following rules on maximum number of requests that can be issued in given period of time:
During a day:
- Maximum of 6000 requests per hour (~1.67 per second)
- Maximum of 120 requests per minute (2 per second)
- Maximum of 3 requests per second
At night:
- Maximum of 8000 requests per hour (~2.23 per second)
- Maximum of 150 requests per minute (2.5 per second)
- Maximum of 3 requests per second
Exceeding these limits won't result in immediate lockdown - no exception will be thrown. But provider can get annoyed, contact us and then ban us from using his service. So I need to have some request delaying mechanism in place to prevent that. Here's how I see it:
public async Task MyMethod(Request request)
{
await _rateLimter.WaitForNextRequest(); // awaitable Task with calculated Delay
await _api.DoAsync(request);
_rateLimiter.AppendRequestCounters();
}
Safest and simpliest option would be to respect the lowest rate limit only, that is of max 3 requests per 2 seconds. But because of "Synchronization" responsibility, there is a need to use as much of these limits as possible.
So next option would be to to add a delay based on current request count. I've tried to do something on my own and I also have used RateLimiter by David Desmaisons, and it would've been fine, but here's a problem:
Assuming there will be 3 requests per second sent by my client to the API at day, we're going to see:
- A 20 second delay every 120th request
- A ~15 minute delay every 6000th request
This would've been acceptable if my application was only about "Synchronization", but "Client" requests can't wait that long.
I've searched the Web, and I've read about token/leaky bucket and sliding window algorithms, but I couldn't translate them to my case and .NET, since they mainly cover the rejecting of requests that exceed a limit. I've found this repo and that repo, but they are both only service-side solutions.
QoS-like spliting of rates, so that "Synchronization" would have the slower, and "Client" the faster rate, is not an option.
Assuming that current request rates will be measured, how to calculate the delay for next request so that it could be adaptive to current situation, respect all maximum rates and wouldn't be longer than 5 seconds? Something like gradually slowing down when approaching a limit.