
I believe I have a fairly typical use case, which is very difficult with eventual consistency. I'm wondering if anyone's already created a python framework to help with this.

I have a GET request that issues a query for a set of entities, which are rarely updated. I also have a POST request that updates one entity at a time. Updating an entity affects whether and how it appears in the GET results.

Since the entities rarely change, I'd like to memcache the GET request for a long time, let's say days or weeks. So on the rare chance I get a POST to update an entity, I can clear memcache.

The problem arises when I handle a POST request, update an entity, and clear the cache, and a GET request comes in soon after. The eventually consistent datastore query may still return the old results, which will then be memcached for the next few days or weeks.

Instead of simply updating the datastore and clearing the cache, I'll need to:

1. update the datastore
2. get the cached query
3. modify the cached query (with the proper sorting too!)
4. update the cache with the new modified query results (with a cas() operation)
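
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: cached results are a list of dicts with an `id` field, `update_cached_query`, `belongs` and `sort_key` are hypothetical names, and `FakeMemcacheClient` is an in-memory stand-in so the sketch runs outside App Engine. The helper only needs a client object with `gets()` and `cas()` methods.

```python
class FakeMemcacheClient(object):
    """In-memory stand-in with set()/gets()/cas(); for local runs only."""

    def __init__(self):
        self._data = {}  # key -> (value, version)
        self._seen = {}  # key -> version observed by the last gets()

    def set(self, key, value):
        version = self._data.get(key, (None, 0))[1] + 1
        self._data[key] = (value, version)
        return True

    def gets(self, key):
        if key not in self._data:
            return None
        value, version = self._data[key]
        self._seen[key] = version  # remember the version for a later cas()
        return list(value)         # return a copy, like a deserialized value

    def cas(self, key, value):
        # Succeeds only if nobody changed the key since our last gets().
        if key not in self._data or self._seen.get(key) != self._data[key][1]:
            return False
        self._data[key] = (value, self._data[key][1] + 1)
        return True


def update_cached_query(client, cache_key, entity, belongs, sort_key, retries=5):
    """Patch a cached result list after `entity` was written to the datastore.

    belongs(entity) -> bool says whether the entity should appear in the query
    results; sort_key(entity) gives the query's sort order. Both are
    assumptions supplied by the caller.
    Returns True if the cache was patched, False otherwise.
    """
    for _ in range(retries):
        cached = client.gets(cache_key)           # step 2: get cached query
        if cached is None:
            return False                          # nothing cached to patch
        # Step 3: drop any old copy of the entity, re-add it if it still
        # matches, and restore the sort order.
        results = [e for e in cached if e['id'] != entity['id']]
        if belongs(entity):
            results.append(entity)
        results.sort(key=sort_key)
        if client.cas(cache_key, results):        # step 4: compare-and-set
            return True
        # Another request changed the key in between; read it again and retry.
    return False
```

On App Engine, I would expect to pass in a `memcache.Client()` from `google.appengine.api` in place of the fake, since it offers `gets()` and `cas()`. When the helper returns `False`, the caller still has to decide what to do (for example, delete the key), and a rebuild from an eventually consistent query at that point can still pick up stale results.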

This seems like a common enough problem. Is there any python framework that can help alleviate this problem?

ndb doesn't help, since datastore queries bypass all of its caches.

If it matters, I'm currently using django-nonrel, and django-tastypie handles the GET requests.

Charles
dragonx
  • Caching queries is arbitrarily difficult - since there's no way to enumerate or query memcache, there's no way to know what queries might be cached in the general case. I don't think there's a way to do this automatically: if you want to cache queries, you have to do it on a case by case basis. – Nick Johnson Dec 10 '12 at 10:27

1 Answer


From my understanding of and experience with memcache on GAE, your cached data will be evicted unless you are driving a very large volume of traffic, and even then it is not guaranteed to stay in the cache.

However, I think you should be able to use ancestor queries, which provide strong consistency by limiting the query's scope to a single entity group. From the docs:

To obtain strongly consistent query results, you need to use an ancestor query limiting the results to a single entity group. This works because entity groups are a unit of consistency as well as transactionality. All data operations are applied to the entire group; an ancestor query won't return its results until the entire entity group is up to date. If your application relies on strongly consistent results for certain queries, you may need to take this into consideration when designing your data model.

Assuming your GET request is tied to some sort of user or other identifiable "parent", or that you can create a global parent shared by all entities that this GET would retrieve, you will be able to use that parent to retrieve a strongly consistent result set.

Here's an example from the docs.

And here's a related question I asked, that shows some nice tricks on setting parent, key and easy to query ID: NDB using Users API to form an entity group

Community
Sologoub