This depends on your specific requirements.
The way I see it, you have 3 options:
- on change - when an entry gets edited, also delete its existing cache entry (and make sure it gets re-created on the next request); there is a small sketch of this after the code below
- periodically - have a cron job that runs every X amount of time and rebuilds the whole cache
- percent based (not sure what to call it exactly) - when an entry is requested, do something like this:
(the code below means that, on average, once in every 1000 requests the cache for the requested page gets cleared)
if (rand(1, 1000) == 666) {
    // clear the cache for the currently requested page
}

// handle the request as usual
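For #1, a minimal sketch of on-change invalidation could look like the following, assuming a simple file-based cache. The cache_path_for(), invalidate_entry_cache() and render_entry() names are just placeholders I made up, and build_entry_html() stands in for whatever your normal rendering code is:

&lt;?php
// Hypothetical helper: where the cached HTML for an entry lives.
function cache_path_for($entryId) {
    return '/tmp/cache/entry_' . (int) $entryId . '.html';
}

// #1: call this from the code path that saves/edits an entry.
function invalidate_entry_cache($entryId) {
    $path = cache_path_for($entryId);
    if (file_exists($path)) {
        unlink($path); // the next request will rebuild it
    }
}

// On request: serve from cache if present, otherwise rebuild and store it.
function render_entry($entryId) {
    $path = cache_path_for($entryId);
    if (file_exists($path)) {
        return file_get_contents($path);
    }
    $html = build_entry_html($entryId); // your normal rendering code goes here
    file_put_contents($path, $html);
    return $html;
}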
Depending on your traffic and the amount of information you cache (and probably other factors as well), any of these can be useful.
#3 works great when you have a huge cache, while #2 is great for smaller caches that get updated often. #1 would be ideal, but it has a very big flaw - sometimes you simply cannot track certain changes. For example, you can't really tell when a template file has changed, so you don't know that the pages built from it need re-caching.
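For #2, the periodic rebuild is usually nothing more than a crontab entry pointing at a script that regenerates everything; the interval and the path below are placeholders, not real values:

# rebuild the whole cache every 30 minutes (example interval and path)
*/30 * * * * php /path/to/rebuild_cache.php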
It's up to you to determine your exact needs: the amount of traffic you are getting or expecting, and how much cache you will have. There are quite a few tools for running these benchmarks (for example, ApacheBench).
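For instance, a quick ApacheBench run against a cached page versus an uncached one already tells you a lot; the URL here is a placeholder:

# 1000 requests, 10 at a time, against the page you want to measure
ab -n 1000 -c 10 http://example.com/some-page/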
PS: You will most likely need a combination of these.
Example: on an application with a huge cache that changes often, I would go with #1 + #3, tuning the exact percentage based on the traffic the application receives and on benchmark results.
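To make that combination concrete, the request path could look roughly like this. It assumes the hypothetical helpers sketched earlier, and the 1-in-1000 figure is just a starting point you would tune:

&lt;?php
// Assumes the cache_path_for()/invalidate_entry_cache()/render_entry()
// helpers sketched above; $entryId comes from the request.
$entryId = (int) ($_GET['id'] ?? 0);

// #3: once in ~1000 requests, force a rebuild even if nothing was edited -
// this also catches changes #1 cannot see (templates, config, ...).
if (rand(1, 1000) === 666) {
    invalidate_entry_cache($entryId);
}

// #1 happens elsewhere, in the code that saves/edits an entry.
echo render_entry($entryId);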
And, to end the answer on a positive note, here is a very nice quote from Leon Bambrick:
There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.