25

The title asks it all. The content on the site I'm building won't change very quickly at all, so Memcache could potentially store data for months, except when I put up an update. Is there a way to make it clear the cache every time I deploy the site? I'm using the Python runtime.

Update 1

Using jldupont's answer I put the following code in my main request handling script...

Update 2

I've switched to the method mentioned by Koen Bok in the selected answer's comments and prefixed all my memcache keys with os.environ['CURRENT_VERSION_ID']/ with the helpful code in the answer's 2nd update. This solution seems to be much more elegant than the function I posted before.
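The key-prefixing approach described in Update 2 might look roughly like the sketch below. The helper name `versioned_key` and the `"dev"` fallback are assumptions for illustration, not part of the original code; only `CURRENT_VERSION_ID` comes from the thread.

```python
import os


def versioned_key(key, environ=os.environ):
    """Prefix a cache key with the deployed version so stale entries are never read."""
    # CURRENT_VERSION_ID is set by the App Engine Python runtime. The "dev"
    # fallback is an assumption so the helper also works outside App Engine.
    version = environ.get("CURRENT_VERSION_ID", "dev")
    return "%s/%s" % (version, key)
```

Every memcache call would then go through the helper, for example `memcache.get(versioned_key("homepage"))`. Entries written by an older deployment are simply never looked up again and age out on their own.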

donut

4 Answers

21

Have you tried the flush_all() function? Docs here. You'll need a bit of logic and state to detect a new deployment, or a special script that performs the flush.

Updated: look at the absolute path of one of your scripts: this changes on every deployment. You can use http://shell.appspot.com/ to experiment:

  import sys
  sys.path

['/base/python_dist/lib/python25.zip', '/base/python_lib/versions/third_party/django-0.96', '/base/python_dist/lib/python2.5/', '/base/python_dist/lib/python2.5/plat-linux2', '/base/python_dist/lib/python2.5/lib-tk', '/base/python_dist/lib/python2.5/lib-dynload', '/base/python_lib/versions/1', '/base/data/home/apps/shell/1.335852500710379686/']

Look at the line with /shell/1.335852500710379686/.

So, just keep a snapshot (in memcache ;-) of this deployment state variable and compare in order to effect a flushing action.
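A minimal sketch of that snapshot-and-compare idea follows. The function name `flush_if_redeployed`, the marker key, and the `FakeCache` class are all hypothetical; `FakeCache` is only an in-memory stand-in so the logic can run outside App Engine. The sketch assumes the cache object exposes `get`, `set` and `flush_all`, as the App Engine `memcache` module does.

```python
class FakeCache(object):
    """Hypothetical in-memory stand-in with memcache-like get/set/flush_all."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        return True

    def flush_all(self):
        self.data.clear()
        return True


def flush_if_redeployed(cache, current, marker_key="deployment-state"):
    """Flush the cache when the stored deployment marker differs from `current`.

    Returns True if a flush happened, False if the marker already matched.
    """
    if cache.get(marker_key) == current:
        return False
    cache.flush_all()
    # Store the marker again after the flush so the next request sees a match.
    cache.set(marker_key, current)
    return True
```

On App Engine, `cache` could be the `memcache` module itself and `current` the directory of the running script, e.g. `os.path.dirname(os.path.abspath(__file__))`. Note that this costs one extra cache lookup each time it runs.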

Updated 2: as suggested by @Koen Bok, the environment variable CURRENT_VERSION_ID can also be used (its value is part of the absolute path to the script files as well).

 import os
 os.environ["CURRENT_VERSION_ID"]
jldupont
  • Yeah, I'm aware of that function. How would I detect a new deployment, that's my main problem? – donut Dec 31 '09 at 01:56
  • You can look at the absolute path of one of your scripts: this will change on every deployment. – jldupont Dec 31 '09 at 02:01
  • 2
    You could use remote_api_shell.py to run flush_all() after you deploy. It's manual, but easy. – Adam Crossland Dec 31 '09 at 02:21
  • Thanks a lot for elaborating. I posted the resulting code in my original question. Works better than a charm! – donut Dec 31 '09 at 05:44
  • 6
    Why not just take CURRENT_VERSION_ID from the environment variables? And prefixing the cache keys is faster than adding one extra memcache lookup for each request with the above method. – Koen Bok Dec 31 '09 at 17:29
3

When creating keys for your cached values, include the version of the file that is doing the cache gets/sets in the key. That way, when a new version of the file exists, it will no longer reference the old versions in the cache - they will be left to expire out on their own.

We use CVS and Java, so we declare this variable in each file that will do caching:

private static final String CVS_REVISION = "$Revision$";

When you check that file out, you'll get something like this:

private static final String CVS_REVISION = "$Revision: 1.15 $";

You can adapt for your language and version control system if not CVS. Remember to encode special characters out of your keys. We've found that URL Encoding key values works well for memcached.
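Adapted to Python, the same idea might look like the sketch below. It assumes the Python 3 standard library (`urllib.parse.quote`); the helper name `revision_key` and the `|` separator are made up for illustration, and the revision string is the example value from above.

```python
from urllib.parse import quote

# Example value as expanded by CVS on checkout (taken from the answer above).
CVS_REVISION = "$Revision: 1.15 $"


def revision_key(name, revision=CVS_REVISION):
    """Build a URL-encoded cache key that embeds the file revision."""
    # safe="" percent-encodes every reserved character, including "/".
    # Spaces become %20, so the resulting key contains no whitespace.
    return quote("%s|%s" % (revision, name), safe="")
```

When the file's revision changes, every key it builds changes with it, so entries cached by the previous revision are no longer referenced.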

Matt
  • 1
    I use git and am not familiar enough with it to know if this is possible. It does seem like it may be a more holistic solution. But jldupont's solution will work without help from a version control system and more accurately answers the question. – donut Dec 31 '09 at 04:46
  • 2
    I do exactly this, but I take the version from the appengine environment variables. Works great. – Koen Bok Dec 31 '09 at 17:26
  • Glad you found a good solution. This idea works well for us, but the other solution looks better for you. – Matt Jan 01 '10 at 04:34
2

I have not tested this, but perhaps you could insert a key holding the version number into memcache on instance start.

Then, when the next instance is started (i.e. after a deployment), it would compare the value in memcache with its local version. If they differ, it would flush everything and re-initialize the key.

The only flaw is that the key may be evicted. You could keep it in the datastore instead of memcache, but then you're making a datastore call on every instance start.

=edit=

Add this to the top of the Python files that are called from app.yaml:

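import os
from google.appengine.api import memcache

# Flush the cache if the stored version differs from the running version
# (this also triggers when the "static-version" key has been evicted).
if memcache.get("static-version") != os.environ["CURRENT_VERSION_ID"]:
    memcache.flush_all()
    memcache.set(key="static-version", value=os.environ["CURRENT_VERSION_ID"])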
mindlesstux
  • Please explain more about the key being "evicted", what this means and how it would affect this solution. – donut Sep 15 '11 at 16:01
  • As per the docs at http://code.google.com/appengine/docs/python/memcache/overview.html#How_Cached_Data_Expires: "Values may be evicted from the cache when a new value is added to the cache if the cache is low on memory. When values are evicted due to memory pressure, the least recently used values are evicted first." That would mean that if the static-version entry got evicted, then, if I am thinking right, the check would not pass and would instead cause the flush_all and the key to be reset. – mindlesstux Sep 16 '11 at 14:07
  • Thanks for explaining. So, wouldn't it be better to just prefix each memcache key with the version instead of keeping the version in an individual value? – donut Sep 16 '11 at 19:56
  • Perhaps, if you don't mind keeping old versions of data around in memcache. Prefixing your keys with the deployed app version leaves behind what I think are called stranded keys (assuming you don't strip off the dynamic part that GAE adds to your app version). – mindlesstux Sep 19 '11 at 21:13
  • I was just thinking that prefixing all keys with the version would avoid the problem of the app version value getting evicted. However, since AppEngine evicts "the least recently used values" first and this call to memcache would be at the start of the app then maybe this wouldn't be a problem. The only situation is when a single request fills up the memcache. – donut Sep 19 '11 at 22:28
0

You could just create an admin-only path that would flush the cache when it's accessed.
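A rough sketch of such a handler is shown below as a plain WSGI app. Everything here is an assumption for illustration: the factory name `make_flush_app`, the response bodies, and the use of WSGI itself. It also assumes access is restricted outside the code, for example with `login: admin` on the handler's entry in app.yaml, and that the flush callable returns a truthy value on success, as `memcache.flush_all` is documented to do.

```python
def make_flush_app(flush):
    """Build a WSGI app that calls `flush()` on every request it receives."""

    def app(environ, start_response):
        # No access check here: restricting the path to admins is assumed
        # to be handled by the hosting configuration.
        ok = flush()
        status = "200 OK" if ok else "500 Internal Server Error"
        body = b"cache flushed\n" if ok else b"flush failed\n"
        start_response(status, [("Content-Type", "text/plain"),
                                ("Content-Length", str(len(body)))])
        return [body]

    return app
```

On App Engine this could be wired up as `make_flush_app(memcache.flush_all)` and mapped to a path of your choosing; requesting that path after each deploy would then clear the cache.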

donut
  • Well, now that's a good idea! But it would be best if this could be automated somehow, like jldupont hints at. – donut Dec 31 '09 at 01:58
  • It could be automated by hacking appcfg.py to include a call to the cache flushing URL. – Adam Crossland Dec 31 '09 at 02:43
  • 2
    Eek, forking the SDK! I'd rather call appcfg.py from a script to upload, and have that script do whatever else, than modify it. Especially since I already only upload from a makefile anyway... – Steve Jessop Dec 31 '09 at 02:53