
I am using Google App Engine (Python) with ndb. I have two model classes:

class X(ndb.Model):  
    subject = ndb.StringProperty()  
    grade = ndb.StringProperty()  

class Y(ndb.Model):  
    identifier = ndb.StringProperty()  
    name = ndb.StringProperty()  
    school = ndb.StringProperty()  
    year = ndb.StringProperty()  
    result = ndb.StructuredProperty(X, repeated=True)  

Since Google stores our data across several data centers, a query like the one below might not return the most recent data (for example, if some changes have just been "put"):

def post(self):  
    identifier = self.request.get('identifier')  
    name = self.request.get('name')  
    school = self.request.get('school')  
    year = self.request.get('year')  
    qry = Y.query(ndb.AND(Y.name==name, Y.school==school, Y.year==year))  
    record_list = qry.fetch()  

My question: how should I modify the above fetch operation to always get the latest data?

I have gone through the related Google help doc but could not understand how to apply it here.

Based on hints from Isaac's answer, would the following be the solution (that is, would "latest_record_data" contain the latest data of the entity)?

def post(self):  
    identifier = self.request.get('identifier')  
    name = self.request.get('name')  
    school = self.request.get('school')  
    year = self.request.get('year')  
    qry = Y.query(ndb.AND(Y.name==name, Y.school==school, Y.year==year))  
    record_list = qry.fetch()  
    record = record_list[0]  
    latest_record_data = record.key.get()  
gsinha

1 Answer

There are a couple of ways to get strong consistency on App Engine. The most common are using gets by key instead of queries, and using ancestor queries.

To use a get in your example, you could encode the name into the entity key:

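from google.appengine.ext import ndb

class Y(ndb.Model):
  result = ndb.StructuredProperty(X, repeated=True)

def put(name, result):
  # Use the name as the key ID so the entity can be fetched by key later.
  Y(key=ndb.Key(Y, name), result=result).put()

def get_record(name):
  # A get by key is strongly consistent; it returns one entity, or None.
  return ndb.Key(Y, name).get()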
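
To make the difference between a query and a get by key concrete, here is a deliberately simplified toy model in plain Python. It is not the Datastore or ndb; `ToyDatastore` and all of its methods are hypothetical names made up for illustration. The assumption it encodes is that a write is visible to a key lookup immediately, while the index used by a non-ancestor query catches up later.

```python
class ToyDatastore(object):
    """Toy model: get() sees writes at once, the query index lags behind."""

    def __init__(self):
        self._entities = {}   # key -> properties (always current)
        self._index = {}      # name -> set of keys (may be stale)
        self._pending = []    # index updates that have not been applied yet

    def put(self, key, name):
        self._entities[key] = {'name': name}
        self._pending.append((name, key))

    def get(self, key):
        # A lookup by key reads the entity directly, so it sees the latest put().
        return self._entities.get(key)

    def query_by_name(self, name):
        # A non-ancestor query consults the index, which may not be updated yet.
        return [self._entities[k] for k in sorted(self._index.get(name, ()))]

    def apply_pending(self):
        # Simulates the index catching up some time after the writes.
        for name, key in self._pending:
            self._index.setdefault(name, set()).add(key)
        self._pending = []


store = ToyDatastore()
store.put('alice', 'alice')
stale = store.query_by_name('alice')      # index has not caught up yet
fresh = store.get('alice')                # key lookup sees the write
store.apply_pending()
caught_up = store.query_by_name('alice')  # now the query finds it too
```

In this sketch the query misses the new entity until `apply_pending()` runs, while `get()` finds it right away, which is why encoding the name into the key helps.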

An ancestor query applies the same idea to a whole group of entities under a common parent key. For example, to fetch the latest record with a specific name:

import time

class Y(ndb.Model):
  result = ndb.StructuredProperty(X, repeated=True)

  @classmethod
  def put_result(cls, name, result):
    # Don't use integers for last field in key. (one weird trick)
    key = ndb.Key('name', name, cls, str(int(time.time())))
    cls(key=key, result=result).put()

  @classmethod
  def get_latest_result(cls, name):
    qry = cls.query(ancestor=ndb.Key('name', name)).order(-cls.key)
    latest = qry.fetch(1)
    if latest:
      return latest[0]

The "ancestor" here is the first (kind, ID) pair of the entity's key path. As long as you can pass a key containing at least that first pair as the query's ancestor, the query is strongly consistent.
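
As a rough illustration of what "first pair" means, the sketch below splits a flat key path, in the same order as the arguments to `ndb.Key('name', name, cls, id)`, into (kind, ID) pairs. It is plain Python and does not use ndb; `key_pairs` and `has_ancestor` are hypothetical helper names, and the sample values are made up.

```python
def key_pairs(*flat):
    """Split a flat key path into (kind, id) pairs, root first."""
    if len(flat) % 2 != 0:
        raise ValueError('a key path needs an even number of elements')
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


def has_ancestor(flat, ancestor_flat):
    """True if the key path starts with the ancestor's key path."""
    pairs = key_pairs(*flat)
    prefix = key_pairs(*ancestor_flat)
    return pairs[:len(prefix)] == prefix


entity_path = ('name', 'alice', 'Y', '1391234567')
ancestor_path = ('name', 'alice')

root = key_pairs(*entity_path)[0]                    # the "ancestor" pair
in_group = has_ancestor(entity_path, ancestor_path)  # shares the same root
```

Every entity written by `put_result` for the same name shares that root pair, so a single ancestor query covers all of them.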

Isaac
  • Thanks for your reply. I have updated the question with a possible solution in my context, based on your answer. Could you please confirm whether it would work? If it would, does it have any drawbacks? – gsinha Feb 01 '14 at 06:39
  • The 'eventual consistency' that occurs with standard queries is that entities which were recently added or recently changed to match the query might be _missing_. The data inside each matching entity will be up to date in both examples you listed. The get() call is not needed. – Isaac Feb 01 '14 at 06:52
  • Because of eventual consistency, the data fetched in my first, fetch-only solution cannot be guaranteed to be up to date. I am not sure about my second solution, which uses a fetch followed by a get. – gsinha Feb 01 '14 at 07:05
  • If the fetch returns the entity you are looking for, the _data_ in the entity will already be up to date. It's as if ndb.Query always fetches the keys for matching entities and then calls get() for you. So you definitely don't need to call get(). – Isaac Feb 01 '14 at 07:12
  • OK. So you mean that a single entity is always strongly consistent and reflects any recent changes made to it, since it is a single entity (or entity group). I hope I understood that correctly. There is a doc for db (https://developers.google.com/appengine/docs/python/datastore/structuring_for_strong_consistency) with guidelines for strong consistency, but none is available for ndb. – gsinha Feb 01 '14 at 07:34
  • I misspoke earlier. Assuming your query returns the entity, the additional get() _will_ get the latest data, while the query has no guarantees about staleness. However, for this to work in ndb, you need to disable the app cache for the query -- otherwise the entity will be cached during the query and the get() won't leave the machine. – Isaac Feb 03 '14 at 19:07
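
A minimal sketch of the pattern described in the last comment: run the query with caching disabled, then re-read the hit by key. It assumes `qry` is an `ndb.Query` and relies on `use_cache` and `use_memcache`, which to my knowledge ndb accepts as context options on both `fetch()` and `Key.get()`. The function name `fetch_first_then_reget` is hypothetical. Note that this only helps if the query returns the entity at all; an entity missing from a stale index is still missed.

```python
def fetch_first_then_reget(qry):
    """Run a query without caching, then re-read the first hit by key.

    `qry` is expected to be an ndb.Query. Returns None if nothing matched.
    """
    # Keep the query results out of the in-context cache and memcache.
    records = qry.fetch(1, use_cache=False, use_memcache=False)
    if not records:
        return None
    # Bypass both caches on the get as well, so the read reaches the Datastore.
    return records[0].key.get(use_cache=False, use_memcache=False)
```

With the model from the question, this could be called as `fetch_first_then_reget(Y.query(Y.name == name, Y.school == school, Y.year == year))`.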