I am using a backend to write multiple entities with ndb.put_multi(list_of_entities).

The issue I am experiencing is that if I run a query immediately after the write, I get no results. If I add a sleep of, say, 1 second, I can read the entities that I just wrote.

For example:

import logging

from google.appengine.ext import ndb

class Picture(ndb.Expando):
    pass

class Favourite(ndb.Expando):
    user_id = ndb.StringProperty(required=True)

# ...make lists with Picture and Favourite kinds
# Combine both lists into one batch (order does not matter for put_multi).
entities = favourites + pictures
ndb.put_multi(entities)

# Non-ancestor query, run right after the write.
favourites = Favourite.query().filter(Favourite.user_id == user_id).fetch(99999, keys_only=True)
logging.info(len(favourites))  # logs 0 in dev_appserver; why?

At first I assumed the problem had to do with caching. But:

From the NDB Entities docs, "Operations on Multiple Keys or Entities":

Advanced note: These methods interact correctly with the context and caching; they don't correspond directly to specific RPC calls.

And from the NDB Caching docs:

The In-Context Cache

The in-context cache persists only for the duration of a single incoming HTTP request and is "visible" only to the code that handles that request. It's fast; this cache lives in memory. When an NDB function writes to the Datastore, it also writes to the in-context cache. When an NDB function reads an entity, it checks the in-context cache first. If the entity is found there, no Datastore interaction takes place.

Queries do not look up values in any cache. However, query results are written back to the in-context cache if the cache policy says so (but never to Memcache).

Hm, I am lost here. Everything seems to be OK. If I run the query from the console I get the correct count, but never within the same handler, no matter which function I use.

The only thing I noticed is that when I add a wait with time.sleep(1), I get the correct results. So it must have to do with whether ndb.put_multi completes synchronously or not. So confused....

Jimmy Kane
  • Suspect it's this: http://stackoverflow.com/questions/12367904/write-read-with-high-replication-datastore-ndb/12368444 eventual consistency. – Paul Collingwood Jan 07 '13 at 23:17
  • Yes, I was also suspecting this @PaulC, but then I have it wrong in my mind. If `put_multi()` returns, doesn't that mean that the entities have been put, the indexes updated, and the commit was successful? I am not using the async methods. – Jimmy Kane Jan 07 '13 at 23:27
  • If you are not using an ancestor query or getting the entities by keys, you run into the eventual consistency issue. If you are using an ancestor query or a key to get individual entities, then it's probably something else. – Sologoub Jan 08 '13 at 00:28
  • Guido keeps reminding me that the ndb cache only caches entities that you get by key/id. Queries are not cached. You're definitely running into an eventual consistency issue since you're not using transactions. – dragonx Jan 08 '13 at 03:23
  • And your commit is successful. That means the results will eventually show up in a query. – dragonx Jan 08 '13 at 03:27
  • Thank you. I am looking into this. – Jimmy Kane Jan 08 '13 at 09:27
  • @Sologoub No, I was not using an ancestor query, and switching to one solved my issue. That explained a lot. Thank you all. – Jimmy Kane Jan 08 '13 at 09:33
  • So if someone wants to post a nice answer about ancestor queries and eventual consistency to close this question, you are welcome. – Jimmy Kane Jan 08 '13 at 09:34
  • It's OK for you to answer and accept your own question too :) – Paul Collingwood Jan 08 '13 at 09:59
  • Google App Engine's High Replication Datastore (HRD) provides high availability for your reads and writes by storing data synchronously in multiple data centers. However, the delay from the time a write is committed until it becomes visible in all data centers means that queries across multiple entity groups (non-ancestor queries) can only guarantee eventually consistent results. Consequently, the results of such queries may sometimes fail to reflect recent changes to the underlying data. However, a direct fetch of an entity by its key is always consistent. – Jimmy Kane Jan 09 '13 at 17:28

1 Answer

A clear mind in the morning is always better than a dizzy mind at night.

Thank you all for the comments. Problem solved. You led me in the right direction, so to answer my own question:

I used ancestor queries to get the correct results. It's worth mentioning the following:

Understanding NDB Writes: Commit, Invalidate Cache, and Apply

The NDB function that writes the data (for example, put()) returns after the cache invalidation; the Apply phase happens asynchronously.

That means that after each put the apply phase might not have completed.

And:

This behavior affects how and when data is visible to your application. The change may not be completely applied to the underlying Datastore until a few hundred milliseconds or so after the NDB function returns. A non-ancestor query performed while a change is being applied may see an inconsistent state (i.e., part but not all of the change). For more information about the timing of writes and queries, see Transaction Isolation in App Engine.

Also, a note on consistency between reads and writes, taken from Google Academy, "Retrieving data from the Datastore":

Google App Engine's High Replication Datastore (HRD) provides high availability for your reads and writes by storing data synchronously in multiple data centers. However, the delay from the time a write is committed until it becomes visible in all data centers means that queries across multiple entity groups (non-ancestor queries) can only guarantee eventually consistent results. Consequently, the results of such queries may sometimes fail to reflect recent changes to the underlying data. However, a direct fetch of an entity by its key is always consistent.
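To make the three read paths above concrete, here is a minimal pure-Python sketch of the Commit/Apply behaviour. It is a toy model, not the real Datastore or the ndb API: the ToyDatastore class, its method names, and the (group, kind, id) key tuples are all hypothetical and exist only for illustration. It assumes that a non-ancestor query reads only from an index updated in the Apply phase, while a lookup by key or an ancestor query first rolls forward any pending writes in that entity group.

```python
class ToyDatastore(object):
    """Toy model of write visibility; NOT the real Datastore or ndb API."""

    def __init__(self):
        self._committed = {}  # key -> entity; updated when a write commits
        self._index = {}      # key -> entity; updated only in the Apply phase
        self._pending = []    # committed keys whose Apply phase has not run

    def put_multi(self, items):
        """Commit (key, entity) pairs and return the keys; Apply is deferred."""
        keys = []
        for key, entity in items:
            self._committed[key] = entity
            self._pending.append(key)
            keys.append(key)
        return keys

    def apply_pending(self, group=None):
        """Run the Apply phase for one entity group, or for all if group is None."""
        remaining = []
        for key in self._pending:
            if group is None or key[0] == group:
                self._index[key] = self._committed[key]
            else:
                remaining.append(key)
        self._pending = remaining

    def get(self, key):
        """Lookup by key: rolls the key's group forward, then reads."""
        self.apply_pending(key[0])
        return self._committed.get(key)

    def query(self, kind, **filters):
        """Non-ancestor query: reads the index as it is, so it may lag."""
        return self._scan(kind, filters, group=None)

    def ancestor_query(self, group, kind, **filters):
        """Ancestor query: rolls the group forward before scanning it."""
        self.apply_pending(group)
        return self._scan(kind, filters, group=group)

    def _scan(self, kind, filters, group):
        results = []
        for key, entity in self._index.items():
            key_group, key_kind, _ = key
            if key_kind != kind:
                continue
            if group is not None and key_group != group:
                continue
            if all(entity.get(n) == v for n, v in filters.items()):
                results.append(key)
        return sorted(results)


store = ToyDatastore()
group = ('User', 'u1')  # hypothetical entity group: one per user
keys = store.put_multi([
    ((group, 'Favourite', 1), {'user_id': 'u1'}),
    ((group, 'Favourite', 2), {'user_id': 'u1'}),
    ((group, 'Picture', 1), {}),
])

# Right after the write returns, the non-ancestor query sees nothing yet.
stale = store.query('Favourite', user_id='u1')

# The ancestor query rolls the group forward and sees both favourites.
consistent = store.ancestor_query(group, 'Favourite', user_id='u1')

# Lookups by key also return the committed entities.
by_key = [store.get(k) for k in keys]

# Once the writes are applied, the non-ancestor query catches up.
later = store.query('Favourite', user_id='u1')
```

In ndb terms, the consistent paths correspond roughly to creating the entities with a shared parent, e.g. `Favourite(parent=parent_key, user_id=user_id)`, and then querying with `Favourite.query(ancestor=parent_key)`, or to reading back the keys returned by `ndb.put_multi()` with `ndb.get_multi(keys)`. The `parent_key` name is an assumption here; how to group entities is a design choice, since writes to a single entity group are rate-limited.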

Thanks to @Paul C for constantly helping and @dragonx and @sologoub for helping me understand.

Jimmy Kane