
I am developing an integration solution that accesses a rate-limited API. I am performing a variety of CRUD operations on the API using multiple HTTP verbs on different endpoints (all on the same server, though). I have been pointed towards Polly multiple times, but I haven't managed to come up with a solution that actually works.

This is what I have in my startup:

builder.Services
    .AddHttpClient("APIClient", client =>
    {
        client.BaseAddress = new Uri(C.Configuration.GetValue<string>("APIBaseAddress"));
    })
    .AddTransientHttpErrorPolicy(builder =>
        builder.WaitAndRetryAsync(new[]
        {
            TimeSpan.FromSeconds(1),
            TimeSpan.FromSeconds(5),
            TimeSpan.FromSeconds(15),
        }));

This is just resilience, retrying in case of failure. I have a RateLimit policy in a singleton ApiWrapper class:

public sealed class ApiWrapper
{
    private static readonly Lazy<ApiWrapper> lazy = new Lazy<ApiWrapper>(() => new ApiWrapper());
    public static ApiWrapper Instance { get { return lazy.Value; } }
    private IHttpClientFactory _httpClientFactory;
    public readonly AsyncRateLimitPolicy RateLimit = Policy.RateLimitAsync(150, TimeSpan.FromSeconds(10), 50); // 150 actions within 10 sec, 50 burst

    private ApiWrapper()
    {
    }

    public void SetFactory(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public HttpClient GetApiClient()
    {
        return _httpClientFactory.CreateClient("APIClient");
    }
}
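
For completeness, the factory is handed to the singleton once at startup; I have not shown that wiring above, but it looks roughly like this (a sketch, the exact placement depends on the host setup):

var app = builder.Build();

// Resolve the registered IHttpClientFactory once and hand it to the singleton.
ApiWrapper.Instance.SetFactory(app.Services.GetRequiredService<IHttpClientFactory>());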

That policy is used in multiple other classes like this:

public class ApiConsumer
{
    private HttpClient _httpClient = ApiWrapper.Instance.GetApiClient();

    public async Task<bool> DoSomethingWithA(List<int> customerIDs)
    {
        foreach (int id in customerIDs)
        {
            HttpResponseMessage httpResponse = await ApiWrapper.Instance.RateLimit.ExecuteAsync(() => _httpClient.GetAsync("http://some.endpoint"));
        }
        return true; // simplified for this example
    }
}

My expectation was that the rate limiter would not fire more requests than configured, but that does not seem to be true. From my understanding, the rate limiter simply throws an exception once more calls come in than the configured limit allows. That's where I thought the retry policy would come into play: just try again after 5 or 15 seconds if the call did not get through the limiter.
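
To make that expectation concrete, the combination I had in mind would look roughly like this (a sketch, not code from my project; RateLimitRejectedException and its RetryAfter property come from Polly's Polly.RateLimit namespace):

// A retry that handles the limiter's rejection and waits for the duration
// Polly suggests before trying again.
var retryOnRejection = Policy
    .Handle<RateLimitRejectedException>()
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: (attempt, exception, context) =>
            ((RateLimitRejectedException)exception).RetryAfter,
        onRetryAsync: (exception, sleep, attempt, context) => Task.CompletedTask);

// The retry wraps the limiter, so a rejected call is re-attempted later.
var resilient = retryOnRejection.WrapAsync(ApiWrapper.Instance.RateLimit);
HttpResponseMessage httpResponse = await resilient.ExecuteAsync(() => _httpClient.GetAsync("http://some.endpoint"));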

Then I played around a bit with Polly's Bulkhead policy, but as far as I can see that is meant to limit the number of parallel executions, not the request rate.
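
For reference, a bulkhead is declared like this (a sketch; the numbers are arbitrary):

// Bulkhead caps concurrency: at most 10 calls in flight, 100 queued waiting.
// It knows nothing about calls per time window, so it cannot enforce
// "150 requests per 10 seconds" on its own.
var bulkhead = Policy.BulkheadAsync(maxParallelization: 10, maxQueuingActions: 100);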

I have multiple threads that may use different HttpClients (all created by the factory, as in the example above) with different methods and endpoints, but all of them use the same policies. Some threads run in parallel, some sequentially, as I have to wait for their responses before sending the next requests.

Any suggestions on how this can or should be achieved with Polly? (Or any other extension if there is good reason to)

Aileron79
  • With some APIs, when they return 429 Too Many Requests, the response includes a Retry-After value that says when to try again (either in seconds or as an absolute time). So basically, the API tells you when to try again, rather than you trying again and immediately being rejected. – Neil Sep 13 '22 at 14:18
  • That's not *Polly's* job. Polly is used for recovery, retries, and ensuring you don't exceed the rate limit. How are you going to handle throttled requests, though? Will you reject them? Queue them? That's an important application feature, not something you can just get out of a library. – Panagiotis Kanavos Sep 13 '22 at 14:19
  • What about the code that made those requests? Do you allow it to keep generating requests that can't be served, or do you use a backpressure mechanism to tell it to slow down? That's the job of Channels, the Dataflow library, `Parallel.ForEachAsync` etc. – Panagiotis Kanavos Sep 13 '22 at 14:21
  • Thanks all for the heads-up. It was probably too naive to assume that Polly would just queue those requests and send them one by one, ensuring the rate limit is not hit. @Panagiotis: I assume I need to create a queue, some mechanism to process the items on the queue, and a way to return the responses to the requesting thread. Is there any extension/framework you would recommend looking at? I will have a look at Dataflow, but I am not sure it is what I need, as the requests per se can be fired sequentially; there is no need to run them in parallel... (Sorry, I'm quite new to C#.) – Aileron79 Sep 13 '22 at 14:38
  • @Neil: I don't want to get a 429 in the first place... – Aileron79 Sep 13 '22 at 14:38
  • The point of a 429 is to let the server rate limit the client, not the other way around. If you call the server and get a 429 saying "try again in 500ms", and you try again in 490ms, you will get another 429 saying "try again in 10ms". Rate limiting is not under the client's (your) control. – Neil Sep 13 '22 at 16:02
  • @Aileron79 It is unclear to me which part of the resilience protocol you are working on. Do you want to impose rate limiting on the server side? Or do you want to define resilient clients which can overcome throttling? Or both? – Peter Csala Sep 13 '22 at 16:17
  • @Peter: I want my client not to send more requests than the server is willing to handle, thus I'd like to throttle the outgoing requests. – Aileron79 Sep 13 '22 at 20:41
  • @Neil: I don't really agree; the whole idea of rate limiting is to free up resources, ensure availability, and prevent a server from being flooded. While technically you are absolutely right, I feel it makes more sense to have the clients throttle the requests, which would reduce traffic and server load. – Aileron79 Sep 13 '22 at 20:46
  • How do you know a server is 'flooded'? By rate limiting yourself, you are only slowing yourself down. If the server can process 1M requests per second, and you are the only user, why not use all 1M slots? If there are 1M requests from other clients and you send another one, slowing yourself down isn't going to get the request done any faster, and you may well get a 429 anyway, as that's the standard mechanism. – Neil Sep 14 '22 at 08:27

2 Answers


In this post I would like to clarify things around the terms rate limiter and rate gate.

Similarity

  • Both concepts can be used to throttle requests.
  • Both sit between the clients and the server, and both know about the server's capacity.

Difference

  • The limiter, as its name implies, limits transient traffic: it short-circuits (rejects) requests if there are too many.

  • The gate, on the other hand, holds/delays requests until there is enough capacity, as the sketch below shows.
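
A minimal sketch of the behavioral difference, using Polly's RateLimitAsync as the limiter and the RateLimiter package's TimeLimiter as the gate (CallApiAsync is a placeholder for any Task-returning HTTP call):

// Limiter: an excess call is rejected immediately with an exception.
var limiter = Policy.RateLimitAsync(150, TimeSpan.FromSeconds(10));
try
{
    await limiter.ExecuteAsync(() => CallApiAsync());
}
catch (RateLimitRejectedException ex)
{
    // The caller must decide what to do next; ex.RetryAfter hints when
    // capacity should be available again.
}

// Gate: the awaited task simply completes later, once there is capacity.
var gate = TimeLimiter.GetFromMaxCountByInterval(150, TimeSpan.FromSeconds(10));
await gate.Enqueue(() => CallApiAsync());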

Peter Csala
  • Thanks a lot for clarifying the terminology; I was not aware of that. Obviously the author of the RateLimiter package that I now use was not aware of it either, so the package should actually be named RateGate or something like that, as it clearly implements timing functionality. – Aileron79 Sep 14 '22 at 11:09
  • @Aileron79 Yes, exactly. A more proper name for that package would be rate gate. – Peter Csala Sep 14 '22 at 12:16

Thanks again to @Neil and @Panagiotis for pointing me in the right direction. I wrongly assumed that the Polly rate limiter would actually delay API calls. I found a workaround that is probably not particularly nice, but for my purpose it does the trick.

I installed David Desmaisons' RateLimiter package, which is super simple to use. In my singleton I now have this:

public TimeLimiter RateLimiter = TimeLimiter.GetFromMaxCountByInterval(150, TimeSpan.FromSeconds(10));

I use this RateLimiter everywhere I make calls to an API endpoint like this:

HttpResponseMessage httpResponse = await ApiWrapper.Instance.RateLimiter.Enqueue(() => _httpClient.GetAsync($"http://some.endpoint"), _cancellationToken);
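
Applied to the ApiConsumer example from my question, the loop becomes (a sketch; _cancellationToken is assumed to be a field on the class):

public async Task<bool> DoSomethingWithA(List<int> customerIDs)
{
    foreach (int id in customerIDs)
    {
        // Each call waits in the gate's queue until capacity is free, so no
        // more than 150 requests go out per 10 seconds.
        HttpResponseMessage httpResponse = await ApiWrapper.Instance.RateLimiter.Enqueue(
            () => _httpClient.GetAsync("http://some.endpoint"),
            _cancellationToken);
    }
    return true;
}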

This does exactly what I originally expected from Polly.

Aileron79