I am developing a Varnish pipeline that serves a mix of public and restricted resources.

Since access to public resources makes up the vast majority (>99.9%) of the traffic, I want to create shortcuts to bypass auth token validation and other such things for non-restricted resources—or better yet, only go through the authN/Z path if the resource is in a sort of blacklist.

This blacklist could grow to about 1M UUID4 entries within the next few years. In plain text, such a file occupies about 37 MB on disk (36 characters plus a newline per entry), so it should fit comfortably in memory.
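
For a rough check of the memory budget, here is a minimal sketch of the arithmetic. It assumes one canonical 36-character UUID plus a newline per line for the text file, and raw 16-byte values for a packed in-memory form; the function names are made up for illustration.

```python
import uuid

UUID_TEXT_LEN = 36    # canonical form: 32 hex digits plus 4 hyphens
UUID_BINARY_LEN = 16  # a UUID is a 128-bit value


def plain_text_bytes(entries: int) -> int:
    """Size of a file holding one canonical UUID and a newline per line."""
    return entries * (UUID_TEXT_LEN + 1)


def packed_bytes(entries: int) -> int:
    """Size of the same entries stored as raw 16-byte values."""
    return entries * UUID_BINARY_LEN


if __name__ == "__main__":
    # Sanity-check the assumed lengths against a real UUID4.
    sample = uuid.uuid4()
    assert len(str(sample)) == UUID_TEXT_LEN
    assert len(sample.bytes) == UUID_BINARY_LEN

    for entries in (1_000_000, 100_000_000):
        print(entries, "entries:",
              plain_text_bytes(entries) / 1e6, "MB as text,",
              packed_bytes(entries) / 1e6, "MB packed")
```

At 37 bytes per line, 1M entries come to about 37 MB and 100M entries to about 3.7 GB. These figures exclude the per-entry overhead of whatever in-memory structure ends up holding them.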

My question is about how to implement this blacklist so lookups are very fast. I thought about a "native" hash set, or Memcached, or similar methods. Memcached would very likely slow things down if it's distributed. Has anybody implemented a similar approach? Which tools does Varnish put at my disposal?

user3758232

1 Answer

In Varnish you have no direct access to the hash of an object.

However, as you indicated, you could store a list of restricted resources in a key-value store.

KVStore in Varnish Enterprise

When we talk about pure speed of execution, Varnish Enterprise has a KVStore module. This KVStore is kept in local memory and can be rebuilt from a file when a restart occurs.

Varnish Enterprise is not free of charge and requires a license key to be purchased. If you want a lower barrier of entry for the enterprise version, there are official machine images of Varnish Enterprise on AWS, Azure, GCP & OCI, for both Ubuntu & Red Hat. You pay no upfront license and you're charged by the hour. See https://aws.amazon.com/marketplace/pp/B07L7HVVMF?ref_=srh_res_product_title for an example on AWS.

The Redis VMOD

If Varnish Enterprise is not for you, you could also use Carlos Abalde's Redis VMOD. It's free, it's open source, and does the job quite well.

You can even run Lua scripts from within VCL to execute more intricate logic inside Redis.

If you're afraid that Redis will slow you down, you can limit the number of connections, and even make sure the connection is shared.
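
To try the Redis route, the restricted UUIDs first have to get into Redis. Below is a minimal sketch of one way to do that, assuming the UUIDs live in a plain-text file (one per line) and are stored in a Redis set under the hypothetical key `restricted`. It writes raw Redis protocol that can be fed to `redis-cli --pipe`. The key name, script name and file name are illustrative assumptions, not part of the VMOD.

```python
import sys
from typing import Iterable, Iterator

SET_KEY = "restricted"  # hypothetical key name, chosen for this example


def resp_command(*args: str) -> bytes:
    """Encode one command in the Redis protocol as an array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def sadd_stream(lines: Iterable[str], key: str = SET_KEY) -> Iterator[bytes]:
    """Yield one SADD command per non-empty input line."""
    for line in lines:
        member = line.strip().lower()  # normalise case so lookups match
        if member:
            yield resp_command("SADD", key, member)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example invocation (names are illustrative):
    #   python3 load_restricted.py restricted.txt | redis-cli --pipe
    with open(sys.argv[1], encoding="ascii") as source:
        for chunk in sadd_stream(source):
            sys.stdout.buffer.write(chunk)
```

With the set in place, a membership check is a single `SISMEMBER restricted <uuid>` command, which is the kind of call the VCL side would issue through the VMOD.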

Thijs Feryn
  • Thanks, these seem like good options. I'm not sure I'm ready to embark on the enterprise version with closed-source components, so Redis might be what I want to try first. How about memcached running locally over a socket connection? There seems to be a community module for memcached, so I wonder if that would be more efficient than Redis. – user3758232 Aug 26 '20 at 16:45
  • Also, since you mention Lua, would I be able to access the Lua library hash table implementation: https://www.lua.org/source/5.1/ltable.h.html ? – user3758232 Aug 26 '20 at 16:47
  • @user3758232 I'm not sure about the hash table implementation in Lua, but you could give it a try. I don't think Redis will perform much slower than Memcached, and Redis has a lot of extra capabilities. The fact alone that you can run Lua in Redis is a big plus. You also have more data types in Redis, which could make storage & retrieval a lot more efficient. – Thijs Feryn Aug 27 '20 at 08:16