
Firstly, I know there are options for request coalescing, like using an Origin Shield, Varnish, etc.

But those are architectural changes, and I was looking for something more straightforward.

My thought was to use some middleware that would first determine whether the request being processed was enabled for request coalescing (via an attribute on the endpoint), but my lack of understanding of middleware blocked me from this option, so I'm reaching out to the community for a middleware solution or an alternative (while still staying within the .NET project).

My initial target is .NET Core 3.1, but I'm also looking for .NET 6/7 support.

The idea would be to determine whether a request is a duplicate (using query params, headers) and, if so, re-use the result from the first request, so that only one request gets processed for a spike of identical requests.

Request coalescing may also be known as request collapsing.
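
Something like the following is roughly what I had in mind for deciding whether two requests are "the same" (just a sketch; the Accept-Language header below is only an example of a header that might matter, not part of anything I built):

using System.Text;
using Microsoft.AspNetCore.Http;

public static class CoalescingKey
{
    // Builds a key identifying "the same" request; which headers take part
    // would depend on what the endpoint considers a duplicate.
    public static string Build(HttpContext context)
    {
        var request = context.Request;
        var builder = new StringBuilder()
            .Append(request.Method).Append(' ')
            .Append(request.Path).Append(request.QueryString);

        // Example only: treat requests with different Accept-Language values as distinct.
        if (request.Headers.TryGetValue("Accept-Language", out var acceptLanguage))
        {
            builder.Append('|').Append(acceptLanguage);
        }

        return builder.ToString();
    }
}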

I first created an attribute that I could add to an endpoint to enable it to be coalesced with other similar requests, but I didn't even get far enough to actually use it. For those interested, it was the following code:

public class SquashableAttribute : Attribute
{
    public SquashableAttribute()
    {
    }
}

I created the following middleware (and extension method):

public class SquasherMiddleware
{
    private readonly RequestDelegate _next;

    public SquasherMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        string path = context.Request.Path.Value;
        Console.WriteLine($"CHECKING SQUASHABLE FOR {path}");
        await _next.Invoke(context);
        Console.WriteLine($"SQUASHER FINISHED FOR {path}");
    }
}

public static class SquasherMiddlewareExtensions
{
    public static IApplicationBuilder UseSquasher(this IApplicationBuilder builder) => builder.UseMiddleware<SquasherMiddleware>();
}

And a simple endpoint:

[HttpGet]
[Squashable]
public async Task<IActionResult> GetAsync()
{
    await Task.Run(async () => await Task.Delay(10000));

    return Ok();
}

Hooking it up in Startup.cs (after app.UseRouting(), so the resolved endpoint is available to the middleware):

app.UseSquasher();
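
For completeness, the ordering in Configure is roughly this (everything other than UseSquasher is from the standard ASP.NET Core 3.1 template):

public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
    app.UseRouting();

    // After UseRouting() the selected endpoint (and, eventually, its [Squashable]
    // metadata) is available to downstream middleware.
    app.UseSquasher();

    app.UseEndpoints(endpoints => endpoints.MapControllers());
}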

What I noticed when hitting this endpoint multiple times was that I would only see the "CHECKING SQUASHABLE FOR {path}" log for the first request, and nothing for the second request until the first request finished. However, if I made a request to this endpoint and then a request to a DIFFERENT endpoint, I saw the log for the second request before the first had completed.

It seems that the middleware doesn't run for a second identical request until the first completes, but different requests run as expected.
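
For anyone trying to reproduce this, a quick way to fire genuinely concurrent, identical requests outside the browser is a small console client along these lines (the URL is just a placeholder for the test endpoint):

using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class ConcurrentRequests
{
    public static async Task Main()
    {
        using var client = new HttpClient();

        // Start five identical requests at the same time (placeholder URL).
        var calls = Enumerable.Range(0, 5)
            .Select(_ => client.GetAsync("http://localhost:5000/api/values"))
            .ToList();

        await Task.WhenAll(calls);
        Console.WriteLine("All requests completed.");
    }
}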

jackofallcode
  • Why not use a CDN? Pretty high abstraction, no need to change your service. You effectively create a 100ms cache (100ms being the ordinary "instantaneous" UI tolerance, with anything up to 1 sec starting to interrupt the user). You also have a pretty extensive set of options to control how the CDN will cache your requests and for how long - https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control – eocron Feb 24 '23 at 10:20
  • @eocron because it requires an initial request to complete first to get the cached response. Our APIs are already behind AWS CloudFront, which supports Origin Shield (which performs request coalescing), but configuring that in our multi-region setup will be painful. The pain we have is when the caches expire: we get a spike of requests to the origin and we're trying to reduce the impact that has on the backend. – jackofallcode Feb 24 '23 at 10:27
  • Oh, so you have some kind of slow but cacheable operation. Usually I just return *429* with a *Retry-After=5sec* header on subsequent calls, so as not to hold them all with open connections on some spin lock inside the service. Then a CDN/nginx or any other reverse proxy can coalesce those. This will effectively reduce spikes on the service and transfer them to a single caching point, which is the CDN – eocron Feb 24 '23 at 10:33
  • It's an option, but I think with our current setup it would rely on the client (that we don't build!) – jackofallcode Feb 24 '23 at 10:37
  • Sorry, I should confirm: yes, that is correct, we have a cacheable slow operation. We'll probably review Varnish Cache Plus in the end, but I'm interested in a middleware solution. – jackofallcode Feb 24 '23 at 10:43
  • Didn't quite understand from your example how you use your dictionary. Could you please correct the example? In any case you can use Polly for this: https://github.com/Polly-Contrib/Polly.Contrib.DuplicateRequestCollapser – eocron Feb 24 '23 at 10:53
  • The dictionary was in preparation for storing the task for the first request, i.e. how to determine whether the request is already being processed (a rough sketch of that idea is shown after these comments). I saw the Polly thing, but isn't that for downstream requests, not the current request? I'll update the example to remove the dictionary to avoid confusion. – jackofallcode Feb 24 '23 at 11:06
  • Not sure what you mean by "current" requests, because everything is executed concurrently, but you can use this policy to coalesce similar concurrent requests (by key, path, args, time, etc.), without dealing with the pitfalls of caching Tasks (which I assume you will eventually stumble over if you use a dictionary for this). – eocron Feb 24 '23 at 11:30
  • What I mean is that Polly is used for downstream requests; we're not making any downstream requests to other services, it's likely DB-intensive operations. I want to stop the request before the host passes it to the endpoint to handle. – jackofallcode Feb 24 '23 at 12:04
  • You can wrap any piece of code with Polly, it's just a monadic operator. It works like middleware (which is a monadic operator too). It doesn't require anything. And it will do exactly as you say: it will stop similar requests until the first is completed, then resume them with the propagated result. Or you can propagate whatever you feel like. – eocron Feb 24 '23 at 12:12
  • I will have a look at implementing something like this; would you suggest creating a middleware wrapper around it? If so, I refer to my problem with the middleware not being invoked for the same request until the previous request completed, as per my initial description. My first thought was to wrap my method with it, but that would be a lot of changes. – jackofallcode Feb 24 '23 at 12:31
  • For me your example **works perfectly fine**, everything is concurrent. From your example I can only assume that Console.WriteLine is messing with you, because in my experience direct calls to the Console class in an ASP.NET Core web app are only executed on *some* threads. Or some kind of error is raised and you just don't get a chance to invoke the second write line. Without a complete working example I can't identify the problem. – eocron Feb 24 '23 at 13:29
  • I have attached example middleware. Generalization through middleware comes with a cost though - you have to manually propagate your collapsed context to all the other contexts, plus you basically create a memory cache by key, which is going to spike your memory instead of IO; not for long though - when it is complete, everything should go back to normal. – eocron Feb 24 '23 at 14:02
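
For comparison, the dictionary-of-tasks idea mentioned in the comments above would look roughly like the sketch below (string results only, for brevity); as eocron notes, hand-caching Tasks has pitfalls that the Polly collapser in the answer avoids:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class TaskCoalescer
{
    private readonly ConcurrentDictionary<string, Lazy<Task<string>>> _inFlight =
        new ConcurrentDictionary<string, Lazy<Task<string>>>();

    // Runs the factory once per key; concurrent callers with the same key
    // await the same task and therefore share the same result.
    public async Task<string> RunAsync(string key, Func<Task<string>> factory)
    {
        var lazy = _inFlight.GetOrAdd(key, _ => new Lazy<Task<string>>(factory));
        try
        {
            return await lazy.Value;
        }
        finally
        {
            // Remove the entry once the shared task completes so a later spike
            // starts a fresh call instead of reusing a stale result.
            _inFlight.TryRemove(key, out _);
        }
    }
}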

1 Answer


Here is your middleware reworked through the Polly collapser (Polly.Contrib.DuplicateRequestCollapser):

public class SquashableAttribute : Attribute
{
}

public class SquasherMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<SquasherMiddleware> _logger;
    private static readonly IAsyncRequestCollapserPolicy CollapserPolicy = AsyncRequestCollapserPolicy.Create();

    public SquasherMiddleware(RequestDelegate next, ILogger<SquasherMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var endpoint = context.Features.Get<IEndpointFeature>()?.Endpoint;
        var attribute = endpoint?.Metadata.GetMetadata<SquashableAttribute>();
        if (attribute == null)
        {
            await _next.Invoke(context).ConfigureAwait(false);
        }
        else
        {
            await OnInvokeAsync(context).ConfigureAwait(false);
        }
    }

    private async Task OnInvokeAsync(HttpContext httpContext)
    {
        string path = httpContext.Request.Path.Value;
        var collapserContext = new Context(path);
        collapserContext["http"] = httpContext;
        _logger.LogInformation($"Check if pending: {collapserContext.OperationKey}");
        var collapsedContext = await CollapserPolicy.ExecuteAsync(OnDownstreamInvokeAsync, collapserContext).ConfigureAwait(false);
        if (collapsedContext != httpContext)
        {
            _logger.LogInformation($"Collapse your contexts: {path}");
            // copy the relevant Body/Headers from collapsedContext back onto httpContext
        }
    }

    private async Task<HttpContext> OnDownstreamInvokeAsync(Context ctx)
    {
        var httpContext = (HttpContext)ctx["http"];
        _logger.LogInformation($"Executing on downstream: {ctx.OperationKey}");
        await _next.Invoke(httpContext).ConfigureAwait(false);
        return httpContext;
    }
}
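
Registration is just the usual UseMiddleware call; with the .NET 6 minimal hosting model it would look roughly like this (the collapser policy itself comes from the Polly.Contrib.DuplicateRequestCollapser NuGet package):

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();

var app = builder.Build();

app.UseRouting();

// Must run after routing so the endpoint metadata ([Squashable]) has been resolved.
app.UseMiddleware<SquasherMiddleware>();

app.MapControllers();
app.Run();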

Controller:

[ApiController]
[Route("ping")]
public class PingController : ControllerBase
{
    [HttpGet]
    [Squashable]
    public async Task<IActionResult> Get()
    {
        await Task.Delay(10000).ConfigureAwait(false);
        return Ok();
    }
}

Result in logs:

info: WebApplication2.SquasherMiddleware[0]
      Check if pending: /ping
info: WebApplication2.SquasherMiddleware[0]
      Executing on downstream: /ping   <--------- single call to your endpoint
info: WebApplication2.SquasherMiddleware[0]
      Check if pending: /ping          <--------- every other call is waiting
info: WebApplication2.SquasherMiddleware[0]
      Check if pending: /ping
info: WebApplication2.SquasherMiddleware[0]
      Check if pending: /ping
info: WebApplication2.SquasherMiddleware[0]
      Check if pending: /ping
info: WebApplication2.SquasherMiddleware[0]
      Check if pending: /ping
info: WebApplication2.SquasherMiddleware[0]
      Collapse your contexts: /ping    <----- released all of them when first is complete. You are free to use HttpContext to populate all other HttpContexts (with body, headers, etc)
info: WebApplication2.SquasherMiddleware[0]
      Collapse your contexts: /ping
info: WebApplication2.SquasherMiddleware[0]
      Collapse your contexts: /ping
info: WebApplication2.SquasherMiddleware[0]
      Collapse your contexts: /ping
info: WebApplication2.SquasherMiddleware[0]
      Collapse your contexts: /ping
eocron
  • Looks like this is working so far and is a better solution than storing the `Task` that I was thinking about. I still come across my middleware only being invoked once per request, but it looks like that might have been because I was doing two requests through Chrome. If I do the second request through a different browser, I see it checking for pending. Now I just need to figure out the correct way to copy the response between the requests. – jackofallcode Feb 24 '23 at 14:37
  • Might also be that using the HttpContext directly can lead to an ObjectDisposedException if the original context gets disposed before all the other requests finish copying the response. This happened when I tried to debug one of the collapsed requests. – jackofallcode Feb 24 '23 at 14:44
  • Sharing the response is an interesting challenge, `httpContext.Response.Body` is not readable. I've seen suggestions of overriding it with memory stream, which I sort of have working, but the stream needs to be disposed, so I get a disposed exception. Still working on a solution. – jackofallcode Feb 24 '23 at 16:10
  • Just read the stream body before _next.Invoke, replace it with a MemoryStream, then do the same with the response after _next.Invoke. You can disable dispose on the MemoryStream if you write a wrapper around it that does nothing on dispose. Looking at all the mess that comes with middleware, you might be better off just caching your endpoint results. – eocron Feb 24 '23 at 20:28
  • That is what we've done: essentially we're going to cache the result in memory during the downstream invoke, then each of the other requests grabs it from the cache. The cache should only need to last for a few seconds to allow the other requests to grab the data. I will likely update your answer next week once we're happy with it. Then I can accept the answer :) (see the sketch below). – jackofallcode Feb 25 '23 at 06:40
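
For reference, a rough sketch of the response-buffering idea discussed in these comments; the CoalescedResponse and ResponseSharingSketch names are illustrative only (not the final implementation), and error handling is omitted:

using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Illustrative container for the parts of the first response worth sharing.
public class CoalescedResponse
{
    public int StatusCode { get; set; }
    public string ContentType { get; set; }
    public byte[] Body { get; set; }
}

// Sketch of the two pieces that would be folded into SquasherMiddleware.
public class ResponseSharingSketch
{
    private readonly RequestDelegate _next;

    public ResponseSharingSketch(RequestDelegate next) => _next = next;

    // Downstream invoke: swap Response.Body for a MemoryStream so the output can
    // be captured, copy it back to the real stream, and keep a copy to replay.
    public async Task<CoalescedResponse> CaptureResponseAsync(HttpContext httpContext)
    {
        var originalBody = httpContext.Response.Body;
        using var buffer = new MemoryStream();
        httpContext.Response.Body = buffer;

        await _next.Invoke(httpContext).ConfigureAwait(false);

        buffer.Position = 0;
        await buffer.CopyToAsync(originalBody).ConfigureAwait(false);
        httpContext.Response.Body = originalBody;

        return new CoalescedResponse
        {
            StatusCode = httpContext.Response.StatusCode,
            ContentType = httpContext.Response.ContentType,
            Body = buffer.ToArray()
        };
    }

    // Collapsed request: replay the captured response onto its own HttpContext,
    // so the (possibly already disposed) original context is never touched.
    public static async Task ReplayAsync(HttpContext httpContext, CoalescedResponse cached)
    {
        httpContext.Response.StatusCode = cached.StatusCode;
        httpContext.Response.ContentType = cached.ContentType;
        await httpContext.Response.Body
            .WriteAsync(cached.Body, 0, cached.Body.Length)
            .ConfigureAwait(false);
    }
}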