
I am measuring the cost of requests to GAE by inspecting the x-appengine-estimated-cpm-us-dollars header. This works great, and in combination with x-appengine-resource-usage and x-traceurl I can get even more detailed information.

However, a large part of my application runs in the context of task queues, so a large share of the instance-hour costs is consumed by queues. Whenever code is executed outside of the original request, its costs are not included in the x-appengine-estimated-cpm-us-dollars header.

I am looking for a way to measure the full costs caused by each request, i.e. the costs generated by the request itself plus the costs of the tasks that have been added by this request.

Ingo
  • 1
    If you check the logs at appengine.appspot.com, they list all requests and also tell you the cpm us dollars for each request. So you can just add the cpm of the normal request and the task request together. – lucemia Dec 11 '12 at 15:27
  • Thanks! This is okay for manual testing. But how can I get this information programmatically? E.g., is there a URL that I can hit to access this? Also, how can I relate the costs of those tasks back to the original request? Instead of manually testing the costs, I want to measure costs automatically to see how they change over time. – Ingo Dec 11 '12 at 16:45

1 Answer


It may be overkill, but there is a tool that downloads the Google App Engine logs and converts them to SQLite: http://code.google.com/p/google-app-engine-samples/source/browse/trunk/logparser/logparser.py

With this tool, the cpm USD values for both task requests and normal requests are downloaded together. You can store each day's logs in a separate SQLite file and do as much analysis as you want.
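
As a rough sketch of the kind of analysis this allows: the snippet below assumes a hypothetical table `requests` with columns `day`, `path`, `is_task` and `cpm_usd`. The schema actually produced by logparser.py may well differ, so treat the table and column names as placeholders.

```python
import sqlite3

def cost_per_day(conn):
    """Return {day: {'normal': cost, 'task': cost}} from the hypothetical requests table."""
    rows = conn.execute(
        "SELECT day, is_task, SUM(cpm_usd) FROM requests "
        "GROUP BY day, is_task ORDER BY day, is_task"
    )
    totals = {}
    for day, is_task, cost in rows:
        kind = 'task' if is_task else 'normal'
        totals.setdefault(day, {'normal': 0.0, 'task': 0.0})[kind] = cost
    return totals

# Hypothetical sample data; real rows would come from the downloaded logs.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE requests (day TEXT, path TEXT, is_task INTEGER, cpm_usd REAL)")
conn.executemany("INSERT INTO requests VALUES (?, ?, ?, ?)", [
    ('2012-12-11', '/page', 0, 0.25),
    ('2012-12-11', '/path_to_task', 1, 0.5),
    ('2012-12-12', '/page', 0, 0.125),
])
totals = cost_per_day(conn)
```

Running the same query against each daily file would show how the split between normal and task costs changes over time.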

As for relating the cost of a task back to the original request: the log data downloaded with this tool includes the full output of the logging module, so you can do the following.

  1. Generate an id in the original request and log it.
  2. Pass the id to the task.
  3. Log the received id again in the task request.
  4. Find the matching normal and task requests via the id.

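For example:

# in the original request
import logging
import uuid

from google.appengine.api import taskqueue

a_id = uuid.uuid4().hex  # any random id works; uuid is just one choice
logging.info(a_id)  # the id will be included in the request log

taskqueue.add(url='/path_to_task', params={'id': a_id})


# in the task request (inside the handler for /path_to_task)
a_id = self.request.get('id')
logging.info(a_id)  # same id, now in the task's request log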
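
Step 4 could then be done offline. A minimal sketch, assuming each downloaded log record has already been reduced to a (logged id, cpm in USD) pair; this record format is made up for illustration and is not the logparser output format:

```python
from collections import defaultdict

def total_cost_by_id(records):
    """Sum the cost of all log records that share a logged id."""
    totals = defaultdict(float)
    for a_id, cpm_usd in records:
        totals[a_id] += cpm_usd
    return dict(totals)

# Hypothetical records: original requests plus the tasks they enqueued.
records = [
    ('id-1', 0.25),   # original request
    ('id-1', 0.5),    # task added by that request
    ('id-2', 0.125),  # another request that added no tasks
]
totals = total_cost_by_id(records)
```

Each entry in `totals` is then the combined cost of one original request and the tasks carrying its id.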

EDIT1

I think there is another possible way to estimate the cost of normal request + task request. The trick is to run the async task synchronously (assuming the cost would be the same). I have not tried it, but it should be much easier to try.

# in the original request, add a parameter that switches on debug mode
debug = self.request.get('DEBUG')

if debug:
    self.redirect('/path_to_task')
else:
    taskqueue.add(url='/path_to_task')

Thus, when you test the normal request with the DEBUG parameter, it will first process the normal request and return x-appengine-estimated-cpm-us-dollars for it. It will then redirect your test client to the corresponding task request (a task request can also be accessed and triggered via its URL by a client, like a normal request), which returns x-appengine-estimated-cpm-us-dollars for the task request. You can simply add them together to get the total cost.
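
A test client could add the two values up along these lines. The sketch assumes the header value is a dollar amount such as `$0.25`; that format is an assumption, so check a real response first. It takes header dicts that were already collected instead of making the HTTP calls itself, and `parse_cpm` and `total_cpm` are hypothetical helper names.

```python
HEADER = 'x-appengine-estimated-cpm-us-dollars'

def parse_cpm(headers):
    """Extract the estimated cost from a dict of response headers (0.0 if absent)."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = dict((k.lower(), v) for k, v in headers.items())
    value = lowered.get(HEADER)
    if value is None:
        return 0.0
    # Assumed format: an optional leading '$' followed by a decimal number.
    return float(value.strip().lstrip('$'))

def total_cpm(responses):
    """Sum the estimated cost over the headers of several responses."""
    return sum(parse_cpm(h) for h in responses)

# Hypothetical header values for the normal request and the redirected task request.
total = total_cpm([
    {'X-AppEngine-Estimated-CPM-US-Dollars': '$0.25'},
    {'X-AppEngine-Estimated-CPM-US-Dollars': '$0.5'},
])
```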

lucemia
  • 1
    Thanks! I agree that parsing the logs is a solution. However, if appstats would include references to background work (e.g. names of submitted tasks) this could be simplified. Every call to ***.appspot.com/appstats/stats?time=123&type=json** will include calls to the task queue API, like **taskqueue.BulkAdd**. So, maybe a more powerful solution would be to change this to include the task name **taskqueue.BulkAdd[task1,task2]**. So that the background work costs can be retrieved by hitting another appstats URL with the names of the submitted tasks. I did not find a feature request on this yet. – Ingo Dec 11 '12 at 19:21
  • 1
    Thanks, that's a great idea! This way we could measure the exact costs since everything is executed synchronously. Unfortunately, the app would then behave differently during debug vs. prod. For instance, in debug the app would respond a lot slower compared to prod. But given the functionality provided by GAE today, this probably is the best solution. – Ingo Dec 12 '12 at 14:02