NGINX: Ignoring Certain URL Parameters for Cache Purposes

Question

So say my NGINX cache key looks like this:

uwsgi_cache_key $scheme$host$request_method$request_uri;

... and that's mostly what I want. I want NGINX to make a cache key based on the entire URL, including the querystring. So that

https://example.com/?a=1&b=1

and

https://example.com/?a=1&b=2

... are separate pages, cached separately.

However, say that there are other parameters -- c and d -- that I don't want to affect the cache key. In other words, I want

Case 1

https://example.com/

and

https://example.com/?c=1

and

https://example.com/?c=2

and

https://example.com/?c=1&d=2

... to return the same page from the cache.

Case 2

And I want

https://example.com/?a=1

and

https://example.com/?a=1&d=2

and

https://example.com/?a=1&c=1&d=3

... to return the same page from the cache, which is different from the page in case 1.

I'm looking for a way to construct the uwsgi_cache_key so that it can account for these cases. I don't want to do it through redirects.

The number of parameters that I want to ignore when constructing the key (c and d, in this example) is limited; the number of parameters that I don't want to ignore is not.

How would you go about doing this? (Yes, this is mostly about fbclid and utm_* and their cousins.)

UPDATE:

Here is a rewrite of @tero-kilkanen's solution with map, in cases where fbclid and launcher are the undesired parameters. I don't know how much this slows down responses.

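    map $args $cachestep1 {
        default $args;
        # fbclid is the first parameter: drop it and the "&" after it
        ~^(fbclid=[^&]*&?)(.*)$             $2;
        # fbclid comes later, at any position: drop it and the "&" before it
        ~^(.*)(&fbclid=[^&]*)(&?.*)$        $1$3;
    }

    map $cachestep1 $cacheargs {
        default $cachestep1;
        # same two cases for launcher, applied to the output of the first map
        ~^(launcher=[^&]*&?)(.*)$             $2;
        ~^(.*)(&launcher=[^&]*)(&?.*)$        $1$3;
    }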
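For completeness, here is a sketch of how the result might be wired into the cache key. It assumes the two map blocks sit in the http context and that $cacheargs is the output of the second map; it is untested against a real deployment.

```nginx
# Sketch only. $request_uri carries the raw query string, so it is
# replaced here by $uri (the path without arguments) plus the filtered
# arguments from the map chain.
uwsgi_cache_key $scheme$host$request_method$uri$cacheargs;
```

Note that $is_args is deliberately left out: it reflects the original query string, so /?fbclid=1 would yield a "?" while / would not, and the two requests would end up with different keys. Each map also removes only one occurrence of its parameter per request.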

Tero Kilkanen · Answer 1 · 2019-04-08T18:52:16.437


I haven't tested an approach like this, but I think it could work:

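map $args $cacheargs {
    # Without a default, $cacheargs is empty whenever the regex does not match.
    default $args;
    # [^&]+ stops at the next "&", so only the value of a is removed.
    ~^(.*)a=[^&]+&(.*)$ $1$2;
}

map $cacheargs $cacheargs1 {
    default $cacheargs;
    ~^(.*)b=[^&]+&(.*)$ $1$2;
}

uwsgi_cache_key $scheme$host$request_method$uri$cacheargs1;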

The first map removes the a parameter from $args and stores the result in $cacheargs.

The second map removes the b parameter from $cacheargs and stores the result in $cacheargs1.

Then $cacheargs1 is used as part of the cache key.

Original answer below.

You can use:

uwsgi_cache_key $scheme$host$request_method$uri$arg_a$arg_b;

This means that the cache key is built using normalized URI (without query arguments), and query arguments a and b.
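One caveat with concatenating the values directly: a=1 with b=23 and a=12 with b=3 both produce the suffix 123. A hedged variant that labels and delimits each argument could look like this (the "&a=" and "&b=" labels are just an illustrative convention, not anything NGINX requires):

```nginx
# Sketch: label each argument so that different combinations of values
# cannot collapse into the same key. Quoted so the whole key is one value.
uwsgi_cache_key "$scheme$host$request_method$uri&a=$arg_a&b=$arg_b";
```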

edited Apr 08 '19 at 18:52

answered Apr 07 '19 at 00:14

Tero Kilkanen


That won't work for me. I have an unlimited number of possible arguments that I _do_ want to take into account for the cache key. I just need to winnow out a few. – hanksims Apr 07 '19 at 03:39
In that case I think you need to implement a Lua script in nginx which will define the cache key with your logic into a variable, and then you use that variable for the directive. – Tero Kilkanen Apr 07 '19 at 07:15
I guess I was hoping there'd be some tricky stuff I could pull off with the `map` directive, maybe. – hanksims Apr 07 '19 at 18:22
Now that you mention it, there could be a `map` way. Check my updated answer. – Tero Kilkanen Apr 08 '19 at 06:01
Thanks! It's kludgy, but for now it's better than recompiling nginx with Lua support. Hope it's not killing my response time. I rewrote your map blocks to account for cases when the undesired parameter comes first: `map $args $cachestep1 { default $args; ~^(fbclid=[^&]*&?)(.*)$ $2; ~^([^&]*)(&fbclid=[^&]*)(&?.*)$ $1$3; } map $cachestep1 $cacheargs { default $cachestep1; ~^(launcher=[^&]*&?)(.*)$ $2; ~^([^&]*)(&launcher=[^&]*)(&?.*)$ $1$3; }` – hanksims Apr 08 '19 at 22:11
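
For readers who do have Lua available, the Lua route suggested in the comments might look roughly like the sketch below. It assumes an NGINX build with lua-nginx-module and ngx_devel_kit (for example OpenResty); the ignored-parameter list and the utm_ prefix check are illustrative choices taken from the question, and the block has not been tested against a live server.

```nginx
location / {
    # Build the filtered query string once per request.
    set_by_lua_block $cacheargs {
        -- parameters that must not influence the cache key (illustrative list)
        local ignored = { fbclid = true, launcher = true }
        local kept = {}
        -- walk the raw query string pair by pair, preserving order
        for pair in (ngx.var.args or ""):gmatch("[^&]+") do
            local name = pair:match("^([^=]*)")
            if not ignored[name] and not name:find("^utm_") then
                kept[#kept + 1] = pair
            end
        end
        return table.concat(kept, "&")
    }

    uwsgi_cache_key $scheme$host$request_method$uri$cacheargs;
    # uwsgi_pass, uwsgi_cache and related directives omitted
}
```

Unlike the chained map blocks, this removes every occurrence of an ignored parameter regardless of position, and adding another parameter is a one-line change to the table.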
