Are you sure you need to make an API call to the backend?
One common approach to this problem is to store the VISITOR_ID in a cookie. In Varnish, you could have VCL that looks like this:
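sub vcl_recv {
    if (req.http.Cookie !~ "VISITOR_ID") {
        # No visitor cookie yet: go straight to the backend, bypassing the cache.
        return (pass);
    }
    # The visitor is already identified, so drop the cookies to allow a cache lookup.
    unset req.http.Cookie;
}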
sub vcl_backend_response {
    # After a return (pass) in vcl_recv, beresp.uncacheable is already true
    # when vcl_backend_response starts.
    if (!beresp.uncacheable) {
        unset beresp.http.Set-Cookie;
    }
}
The idea here is that on the first request, when the VISITOR_ID cookie is not yet set, the client request goes to the backend and the response carries a Set-Cookie header. On subsequent requests, the client sends the VISITOR_ID cookie, so vcl_recv unsets the request's Cookie header and the request can be a cache hit.
We are also unsetting the Set-Cookie response header in vcl_backend_response, because on a cache miss the request from Varnish to the origin does not contain a Cookie header, so we expect the backend to include a Set-Cookie for VISITOR_ID in the response. By default, that header would stop Varnish from caching the page, so we unset it.
Instead of letting the backend set the visitor ID, would it be possible to let Varnish do this? The benefit of this approach is that the client can get a cache hit on the very first request, with much lower latency than a backend fetch. This could be done in VCL by setting a cookie with a unique value (for example a hash) in vcl_deliver, instead of relying on your backend:
sub vcl_deliver {
    # <unique_id> is a placeholder, not valid VCL; see below for how it could be generated.
    set resp.http.Set-Cookie = "UNIQUE_ID=" + <unique_id>;
}
Here, <unique_id> could be generated using vmod blobsha256 and the request ID (req.xid).
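As a rough sketch of how the pieces above could fit together, the VCL below only hands out a cookie to clients that did not send one. It rests on several assumptions that are not part of the original answer: the backend address is made up, X-Has-Unique-Id is a hypothetical marker header, and std.random combined with req.xid stands in for the hash-based <unique_id> so that only the bundled std vmod is needed.

```vcl
vcl 4.1;

import std;

# Hypothetical origin address; replace with your own backend.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Clients must not be able to spoof the (hypothetical) marker header.
    unset req.http.X-Has-Unique-Id;
    if (req.http.Cookie ~ "(^|;\s*)UNIQUE_ID=") {
        # Remember that this visitor already has an ID before the cookies are dropped.
        set req.http.X-Has-Unique-Id = "1";
    }
    # Without a Cookie header, the request is eligible for a cache lookup.
    unset req.http.Cookie;
}

sub vcl_deliver {
    if (!req.http.X-Has-Unique-Id) {
        # Assumption: request ID plus a random number stands in for <unique_id>.
        # Note that this overwrites any Set-Cookie header already on the response.
        set resp.http.Set-Cookie = "UNIQUE_ID=" + req.xid + "-" + std.random(1, 1000000000) + "; Path=/; HttpOnly";
    }
}
```

Because the cookie is added in vcl_deliver, it never becomes part of the cached object, so every new visitor gets a different value even on a cache hit. A request ID plus a random number is not cryptographically strong; if the ID must be hard to guess, a hashing vmod would be the better fit.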
Considering your requirements, if you are serving content from cache while keeping a unique visitor ID from the backend, you may be thinking about an attribution model. For uncacheable content (such as a purchase), the backend will receive the client's cookies, including the unique VISITOR_ID. However, the backend will not know about any cache hits the client received from Varnish. If this is relevant to collect, you may want to use varnishncsa with a custom format string, together with an infrastructure agent from a data aggregation tool like Grafana, so you can track a visitor's path through your website, including cache hits.
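One possible way to get the visitor ID into the varnishncsa output is to log it from VCL before the Cookie header is removed. The sketch below assumes the cookie is named VISITOR_ID as above; the backend address and the log key "visitor" are made up for illustration.

```vcl
vcl 4.1;

import std;

# Hypothetical origin address; replace with your own backend.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.http.Cookie ~ "(^|;\s*)VISITOR_ID=") {
        # Log the ID under the key "visitor" before the cookies are dropped.
        std.log("visitor:" + regsub(req.http.Cookie, "^(?:.*;\s*)?VISITOR_ID=([^;]*).*$", "\1"));
        unset req.http.Cookie;
    } else {
        # No visitor cookie yet: let the backend issue one.
        return (pass);
    }
}
```

With that in place, a format string such as varnishncsa -F '%t %{VCL_Log:visitor}x %{Varnish:handling}x %m %U %s' should print one line per client request with the timestamp, the logged visitor ID, how the request was handled (hit, miss, pass, pipe or synth), the method, the URL path and the status. An agent could then ship those lines to your aggregation tool.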