When can Google Appengine datastore return stale data?

Question

Is there a difference in the results I can expect from this code:

from google.appengine.ext import db

# Keys-only query, then fetch each entity by key
query = MyModel.all(keys_only=True).filter('myFlag', True)
keys = list(query)
models = db.get(keys)

versus this code:

query = MyModel.all().filter('myFlag', True)
models = list(query)

i.e., will models be the same in both cases?

If not, why not? I had thought that eventual consistency is used to describe how indices for models take a while to update and can therefore be inconsistent with the most recently written data.

But I recently experienced a case where I was actually getting stale data from a query like the second one, where model.myFlag was True for the models retrieved via query but False when I actually got the model via key.

So in that case, where is the data for myFlag coming from?

Is it that getting an entity via key ensures replication across the datastore nodes and returns the latest data, whereas getting it via query simply retrieves the data from the nearest datastore node?

Edit: I read this article, and assuming the Cloud Datastore works the same way as the Appengine Datastore, the answer to my question is yes, entities returned from queries may have stale values.

https://cloud.google.com/developers/articles/balancing-strong-and-eventual-consistency-with-google-cloud-datastore#h.tf76fya5nqk8

Addressing your edit: you are correct. Always use gets by key or ancestor queries if you want strong consistency. @Patrick Costello has you covered. - Jimmy Kane, Feb 12 '14 at 18:19

score 2 · Answer 1 · answered Feb 12 '14 at 17:51

Yes, as you mentioned, queries may return stale values. When doing a query, the datastore chooses performance over consistency.

More in-depth: for each entity group, every node keeps a log of writes that have not been applied yet. When you execute a get by key or an ancestor query, the entity groups involved first have their logs applied, so you see the latest data. When you execute a normal (non-ancestor) query, the results could come from any entity group, so the entity groups are not caught up first and you can get stale values.

Be careful with the first code example, though. The db.get(keys) call returns current data for each key, but the indexes used to find those keys in the first place may not be up to date. So it is quite possible to miss some entities with myFlag = True.

If you are interested, I would recommend reading the Megastore paper.
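To make the log-apply behaviour described above concrete, here is a toy simulation in plain Python. It is only a sketch under loud assumptions: ToyEntityGroup and ToyDatastore are hypothetical names invented for illustration, there is a single replica, and indexes are not modelled separately. It is not how the datastore is actually implemented; it only mimics "get by key catches up the log first, a normal query does not".

```python
class ToyEntityGroup(object):
    """Hypothetical stand-in for one entity group on one replica."""

    def __init__(self):
        self.applied = {}   # key -> properties already applied on this replica
        self.pending = []   # committed writes that have not been applied yet

    def write(self, key, props):
        # A write is logged first; it is applied later.
        self.pending.append((key, dict(props)))

    def catch_up(self):
        # Apply every logged write, in order.
        for key, props in self.pending:
            self.applied[key] = props
        self.pending = []


class ToyDatastore(object):
    """Hypothetical toy store; not the real App Engine API."""

    def __init__(self):
        self.groups = {}  # group name -> ToyEntityGroup

    def put(self, group, key, props):
        self.groups.setdefault(group, ToyEntityGroup()).write(key, props)

    def get(self, group, key):
        # Get by key: the entity group is caught up before reading.
        entity_group = self.groups[group]
        entity_group.catch_up()
        return entity_group.applied.get(key)

    def query(self, prop, value):
        # Normal query: reads whatever is applied, without catching up.
        results = []
        for entity_group in self.groups.values():
            for key, props in entity_group.applied.items():
                if props.get(prop) == value:
                    results.append((key, props))
        return results


store = ToyDatastore()
store.put('group1', 'a', {'myFlag': True})
store.get('group1', 'a')                      # forces the first write to be applied
store.put('group1', 'a', {'myFlag': False})   # logged, not yet applied

stale = store.query('myFlag', True)   # still sees the old value
fresh = store.get('group1', 'a')      # catches up, sees the new value
after = store.query('myFlag', True)   # log was applied by the get above
```

In this toy model the query returns the entity with myFlag still True, while the get by key returns it with myFlag False, which matches the symptom described in the question.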
