12

I want to cache full pages on our web application (thousands of pages) that are rendered by the Rails stack, but don't change very often. Each render is quite expensive in terms of resources.

My understanding of how Varnishd works is that when a URL is first requested, Varnishd checks its cache store and misses, so the request is passed through to Rails, and the page that Rails generates is then added to the Varnishd cache.

Any subsequent calls made to that URL are then served from the Varnishd cache, and the Rails stack is not involved.
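
The flow described so far can be sketched as a toy model in plain Ruby. This is only an illustration of the hit/miss behaviour, not how Varnish is implemented; `ToyCache` and the stand-in backend block are hypothetical names.

```ruby
# Toy model of a caching reverse proxy: first request misses and goes
# to the backend, later requests for the same URL are served from the store.
class ToyCache
  attr_reader :backend_calls

  def initialize(&backend)
    @backend = backend     # stands in for the Rails stack
    @store = {}            # URL => rendered page
    @backend_calls = 0     # how often the backend had to render
  end

  def get(url)
    return @store[url] if @store.key?(url)  # hit: backend not involved

    @backend_calls += 1                     # miss: pass through to backend
    @store[url] = @backend.call(url)        # keep the rendered page
  end
end

cache = ToyCache.new { |url| "<html>rendered #{url}</html>" }
cache.get("/some/page/")   # miss, the backend renders the page
cache.get("/some/page/")   # hit, served from the store
puts cache.backend_calls   # prints 1
```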

Is this correct or am I way off?

How can I have my app tell Varnishd when a specific page has been updated, so that the change is reflected in its cache store?

Is Varnishd a good choice for this purpose?

Thanks for your help - I know these are very basic questions, but docs just don't make this clear (to me at least).

Michiel de Mare
Jason

4 Answers

6

To do dynamic cache invalidation, you can send purge.url {some regexp} from your application server over the management channel. For example, purge.url "^/some/page/$". However, from Rails, it's probably easiest to use the PURGE HTTP method. So instead of doing an HTTP GET, you'd do a PURGE against the same URI:

PURGE /some/page/ HTTP/1.0
Host: example.com

This request has to come in from localhost unless you override that in the configuration.
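
From Ruby, a request like that could be sent with the standard library's `Net::HTTP`. This is a minimal sketch, assuming Varnish listens on 127.0.0.1 port 80 and is configured to accept PURGE; the helper names `build_purge_request` and `purge_page` are made up for illustration.

```ruby
require "net/http"

# Build a PURGE request for the given path. The Host header tells Varnish
# which site's copy of the page is meant.
def build_purge_request(path, site_host)
  # Arguments: method name, has request body?, expects response body?, path, headers
  Net::HTTPGenericRequest.new("PURGE", false, true, path, { "Host" => site_host })
end

# Send the PURGE to Varnish and return the response.
# The address and port defaults are assumptions; adjust them to your setup.
def purge_page(path, site_host:, varnish_host: "127.0.0.1", varnish_port: 80)
  Net::HTTP.start(varnish_host, varnish_port) do |http|
    http.request(build_purge_request(path, site_host))
  end
end
```

Calling `purge_page("/some/page/", site_host: "example.com")` after a record changes (for example from a model callback or an observer) would then ask Varnish to drop its copy of that page.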

Bob Aman
3

I recommend reading this guide to HTTP caching by Mark Nottingham: http://www.mnot.net/cache_docs/

In order to use a reverse proxy with caching you'll need to specify expiry times in your HTTP responses. It's generally not possible to "tell" the caching server when new content is available, because the protocol is meant to be federated across the internet and you wouldn't want to have to tell servers everywhere in the world when you have new kitten pictures :-)
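
As a rough sketch of what "specifying expiry times" looks like, the helper below builds a `Cache-Control` header value in plain Ruby. The name `cache_control` and the chosen lifetimes are assumptions for illustration; in a Rails controller, `expires_in 5.minutes, public: true` sets a comparable header.

```ruby
# Build a Cache-Control value that allows shared caches (such as a
# reverse proxy) to store the response.
#   max_age        - seconds any cache may keep the response
#   shared_max_age - optional longer lifetime for shared caches only
def cache_control(max_age:, shared_max_age: nil)
  parts = ["public", "max-age=#{Integer(max_age)}"]
  parts << "s-maxage=#{Integer(shared_max_age)}" if shared_max_age
  parts.join(", ")
end

headers = { "Cache-Control" => cache_control(max_age: 300, shared_max_age: 3600) }
puts headers["Cache-Control"]  # prints: public, max-age=300, s-maxage=3600
```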

Rails page caching isn't the same thing at all. That just offloads the work to the web server, which serves the files statically, but it doesn't involve the HTTP protocol in the decision.

Caveat: I should point out that I haven't tried Varnish personally. This answer is based on the (I think correct) assumption that Varnish is an HTTP caching reverse proxy.

noodl
  • 1
    Correct, Varnish is an HTTP reverse proxy with HTTP caching and optimized for HTTP caching. If you have tried Heroku, then you have tried Varnish. – yfeldblum Jan 05 '11 at 23:33
  • 1
    Most reverse proxies actually have a way to "tell" the cache that new content is available, however you are correct that you need to know which servers to flush, and configure a way to signal them. Varnish has a management interface, a control channel, that you can connect to from your application or manually, so if you know which servers to flush, it's quite easy to do so. – Martijn Heemels Jun 11 '11 at 19:37
2

Page caching is what you probably want. It's going to be simpler to set up and maintain than Varnish. Caching with a reverse proxy does have some advantages when you start to scale to multiple application servers, since you can invalidate the cache in a single place instead of on each application server.

You can configure Varnish to respond to an HTTP PURGE request which will let Rails tell Varnish when a page has changed. Here's a plugin and article along those lines.

James Mason
  • 1
    If you have simple caching needs, yes. If you need to do anything even remotely complicated with your caching system, Rails page caching is likely to be insufficient. Varnish really shines at the point where most people outgrow Rails page caching. – Bob Aman Jan 06 '11 at 01:29
1

As mentioned in noodl's answer, using a reverse proxy generally puts page expiry out of your control. If you need to manage expiry yourself, an alternative is Rails page caching (see section 1.1). This makes Rails render the response to disk (into the public directory) the first time an action is called, and your front-end web server can then serve those HTML files directly. I use nginx for this, with a directive to serve any static file that exists (typically images, but it works for HTML pages too, given the correct rewrite to account for the .html extension). With the cache managed by Rails, you can expire pages yourself, as in the example on the guides page, which expires an index when a new item is created.

My understanding is that reverse HTTP proxies are intended for, and help performance at, very high throughput, since they allow cached copies to propagate to parts of the network outside your control. However, if render time is the problem, as you suggest, then Rails page caching might be a good option for you.

Jeremy
  • Actually... it's pretty rare these days to find a dedicated reverse proxy that doesn't support cache invalidation prior to hitting the TTL. It's kind of an important use-case. It's just not usually done using standard HTTP caching headers. – Bob Aman Jan 06 '11 at 01:26
  • Ah, well that's good to know, I see there's a new answer mentioning how to do just that. – Jeremy Jan 06 '11 at 01:45