Here is the entity I would like to pass to the task:

class MyData(ndb.Model):
    ...
    text = ndb.StringProperty(indexed=False)
    data = ndb.BlobKeyProperty(repeated=True)
    details = ndb.KeyProperty(kind=Details)

Can I do something like below?

mydata = MyData.query()
mydata = mydata.filter(...)
mydata = mydata.order(MyData.added)
mydata = mydata.fetch(100)
for d in mydata:
  taskqueue.add(url='/worker', payload=d)

How can I then extract the data from the payload? I don't think self.request.get('payload') will work. I understand that I can pass just the ndb key and read the entity within the task, but that requires additional read operations. Alternatively, can I somehow use keys_only with fetch(100)? According to the docs, keys-only operations are free:

Small datastore operations include calls to allocate datastore ids or keys-only queries, and these operations are free.

But are they counted as datastore read operations?

LA_

2 Answers

d is still an ndb object. To pass that as a dictionary, try this (untested):

taskqueue.add(url='/worker', payload=d.to_dict())

https://developers.google.com/appengine/docs/python/ndb/modelclass#Model_to_dict

GAEfan
  • This will not work as ``details`` is an ndb.KeyProperty, which is not JSON encodable without some additional work. I'd override ``to_dict`` to handle KeyProperty and make sure to apply ``.urlsafe()`` to convert them to strings. Then, you'll have no problem fetching those entities later. – Josh May 30 '14 at 20:21
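One way to sketch the idea from the comment above, assuming the task payload is sent as JSON: rather than overriding `to_dict`, this version uses the `default` hook of `json.dumps` to convert values that JSON cannot encode. The names `encode_payload` and `_encode_special` are hypothetical, and keys are detected by duck typing (anything with a `urlsafe()` method) so that the sketch runs without the App Engine SDK.

```python
import json


def _encode_special(value):
    """Called by json.dumps only for values it cannot encode itself."""
    # Assumption: anything exposing urlsafe() is an ndb.Key.
    if hasattr(value, 'urlsafe'):
        safe = value.urlsafe()
        # Some ndb versions return bytes here; JSON needs text.
        return safe.decode('ascii') if isinstance(safe, bytes) else safe
    # Assumption: str() is an acceptable form for anything else
    # (for example BlobKey or datetime values).
    return str(value)


def encode_payload(entity_dict):
    """Serialize the result of entity.to_dict() to a JSON string."""
    return json.dumps(entity_dict, default=_encode_special)


# Intended use with the question's loop (not executed here):
#   taskqueue.add(url='/worker', payload=encode_payload(d.to_dict()))
```

The worker can later turn the `details` string back into a key with `ndb.Key(urlsafe=...)`.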
I would do this with a keys_only query (as you mentioned, this should incur little to no charge), and I would add the tasks to the queue in a single batch call.

from google.appengine.api import taskqueue

mydata = MyData.query()
mydata = mydata.filter(...)
# The order shouldn't matter, unless you want to make sure that property exists.
# mydata = mydata.order(MyData.added)
# keys_only=True returns ndb.Key objects instead of full entities.
keys = mydata.fetch(100, keys_only=True)
# Build all tasks first, then enqueue them in one batch call.
tasks = [taskqueue.Task(url='/worker', params={'key': key.urlsafe()}) for key in keys]
taskqueue.Queue('default').add(tasks)
Josh
  • Thank you! What is the advantage of a single batch call? I am planning to use a unique name for each task (so that the same error is not processed twice). What will happen with a batch call if a task with a duplicate name already exists? – LA_ May 31 '14 at 11:59
  • Could you please also help me understand how to extract the payload? `self.request.get('payload')` is empty. – LA_ May 31 '14 at 20:27
  • Updated the ``payload`` to ``params`` - with the way it is now you'll use ``self.request.get('key')``. – Josh May 31 '14 at 23:44
  • https://developers.google.com/appengine/docs/python/taskqueue/queues#Queue_add describes what happens when you add tasks that already exist. If you are worried about duplicate tasks per item, I'd add them one at a time and catch ``(taskqueue.TaskAlreadyExistsError, taskqueue.TombstonedTaskError)`` – Josh May 31 '14 at 23:45
  • Ah, yes, I know how to use `params`, but I was especially interested in `payload` ;) – LA_ Jun 01 '14 at 06:00
  • You can retrieve the ``payload`` from ``self.request.body`` in the task handler. :) (payload is the POST data) – Josh Jun 02 '14 at 18:18
  • Could you please post the code? The idea is clear, but I don't understand how to implement it... – LA_ Jun 03 '14 at 06:54
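A minimal sketch of what the last comments describe, assuming the task was added with a JSON string as its `payload` and that the worker is a webapp2 handler (which `self.request` suggests, but the thread does not confirm). The helper name `parse_task_payload` and the `Worker` class are hypothetical. The handler itself is shown only as a comment so that the helper runs without the App Engine SDK.

```python
import json


def parse_task_payload(body):
    """Decode the raw POST body of a task request into Python data."""
    # The body may arrive as bytes; JSON parsing wants text.
    if isinstance(body, bytes):
        body = body.decode('utf-8')
    return json.loads(body)


# In the task handler, the payload is the raw request body, not a form
# field, which is why self.request.get('payload') comes back empty:
#
#   class Worker(webapp2.RequestHandler):
#       def post(self):
#           data = parse_task_payload(self.request.body)
#           text = data['text']
```

With `params`, the values are form-encoded and `self.request.get('key')` works; with `payload`, the string is sent as-is, so any encoding and decoding is up to the caller.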