How to bypass varnish cache based on specific header

Question

I've been working on this for two days, but no luck.

So, basically, I want to bypass the varnish cache for a specific incoming request URL.

I've defined this rule:

sub vcl_recv {
     if (req.url ~ "/en/reading-books/") { return(pass); }
}

But when I refresh the page, it's still being cached. The response headers include:

via: 1.1 varnish-v4
x-varnish: 2

and this background-running command produces output:

varnishncsa -F '%{Host}i %h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"'

What am I missing here? Any pointers are really appreciated.

What version of Varnish are you running? This should work fine in Varnish 6 at least. Could you post an obfuscated but complete config? — Orphans, Dec 15 '22 at 08:00

score 2 · Accepted Answer · answered Dec 15 '22 at 08:02

Upgrade to a supported version

Before I talk about debugging caching of that incoming URL, I want to point out that you're running an end-of-life version of Varnish that has known security vulnerabilities.

Please either upgrade to the latest version or use Varnish Cache 6.0 LTS.

See https://www.varnish-software.com/developers/tutorials/#installations for a list of install guides for a variety of Linux distributions.

Matching an exact URL or a URL pattern

The VCL code you shared bypasses the cache for any URL that contains /en/reading-books/, because the ~ operator performs an unanchored regular expression match. Are you trying to do an exact match on that URL or a pattern that matches a set of URLs?

For an exact match, I'd adjust the URL as follows:

sub vcl_recv {
    if (req.url == "/en/reading-books/") {
        return(pass);
    }
}

When you're matching multiple URLs, all of which start with /en/reading-books/, I'd adjust the VCL as follows:

sub vcl_recv {
    if (req.url ~ "^/en/reading-books/.*$") {
        return(pass);
    }
}
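
Since the title asks about bypassing the cache based on a header, the same pattern can be applied to a request header instead of the URL. This is only a minimal sketch: the header name X-Bypass-Cache and the backend address are assumptions made for illustration, not something Varnish defines or something taken from your setup.

```vcl
vcl 4.1;

# Hypothetical backend; replace host and port with your own.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # X-Bypass-Cache is an illustrative header name, not a Varnish built-in.
    # The condition is true whenever the request carries this header.
    if (req.http.X-Bypass-Cache) {
        return(pass);
    }
}
```

Keep in mind that any client can send such a header, so you may want to restrict it (for example with an ACL) before relying on it in production.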

What about the Varnish-specific response headers?

You mentioned the following headers in your question:

via: 1.1 varnish-v4
x-varnish: 2

These indicate that you're using Varnish, not necessarily that Varnish serves the response from the cache.

The via header just informs the user about the fact that Varnish is a proxy server in the response chain.

The value of the x-varnish header usually refers to the ID of the transaction that handled your request.

It's actually the Age header that indicates how long, in seconds, a response has been stored in the cache.
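
If you want a more explicit signal than Age, a common approach is to set a custom response header in vcl_deliver based on obj.hits. This is a sketch rather than a required part of the fix, and the X-Cache header name is an assumption chosen for illustration.

```vcl
sub vcl_deliver {
    # obj.hits counts how often the object was served from cache.
    # For a miss or a pass it is 0.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

With this in place, a request to a URL that is passed in vcl_recv should come back with X-Cache: MISS on every refresh.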

Checking the logs

The varnishncsa command has access to the Varnish Shared Memory Logs, but doesn't really display a lot of useful caching information.

The purpose of varnishncsa is to return access log information, similar to what Apache and Nginx return.

Please run the following command to debug the caching:

varnishlog -g request -q "ReqUrl ~ '^/en/reading-books/.*$'"

While varnishncsa produces an NCSA-format single-line response, varnishlog will return the full transaction. Please add the output from varnishlog to your question and I'll help you debug.
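
To make the bypass decision easier to spot in the varnishlog output, you could also log it from VCL using std.log from the std VMOD. This is a hedged sketch: the "cache-bypass" prefix is an arbitrary label chosen here, and the import line has to sit near the top of your VCL file.

```vcl
import std;

sub vcl_recv {
    if (req.url ~ "^/en/reading-books/") {
        # Writes a VCL_Log record to the shared memory log.
        # "cache-bypass" is just an illustrative marker string.
        std.log("cache-bypass: " + req.url);
        return(pass);
    }
}
```

The message then shows up under the VCL_Log tag of the transaction, so a query such as -q "VCL_Log ~ 'cache-bypass'" should narrow varnishlog down to the requests that were passed by this rule.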

This is a pretty clear explanation, especially that regex pattern, it gives me a better understanding of how to process similar incoming requests. You just saved my day, thank you! — Budianto IP, Dec 16 '22 at 00:06
btw, I've upgraded to version 6 as you suggested, it works better now. — Budianto IP, Dec 16 '22 at 00:06
