
In our GAE application, deferred tasks occasionally fail with TombstonedTaskError because a named task is submitted more than once under the same name. GAE appears to resubmit these tasks automatically from time to time, even though the first execution of the deferred task succeeded.

An example can be seen in this log screenshot: the task "refresh_stock_status-1451012400-GNeg-completion-poll-2" was submitted on the morning of Dec 25th and completed at 07:35:05 on 12-25. For some reason, GAE then retried the task automatically ("X-Appengine-Taskretrycount:1") at 07:35:18, giving the reason "X-Appengine-Taskretryreason:Instance Unavailable". Clearly, an instance (with an id ending in "...2976") had been available, and the task had already completed. Why was it retried like this? What can be done to prevent it? These errors are rare, but they are misleading, and checking them takes time when we monitor our application.

The only similar situation I've found described online is at https://groups.google.com/forum/#!topic/google-appengine/0JWCp3OGnMI . The only suggestion offered there was to adjust the instance scaling configuration, which might reduce how often the error occurs but would have too many side effects in our case.

Martin
  • It seems the code that executed in the queue is enqueuing another task with the same name, which is why you get the tombstone error. AFAIK, a task is guaranteed to be executed at least once, which is why it needs to be idempotent – marcadian Jan 05 '16 at 23:12
  • As can be seen from the GAE headers in the logs, the second (07:35:18) task was launched automatically as a retry, due to "instance unavailable". It was not caused by a deferred.defer() call. – Martin Jan 06 '16 at 10:14
  • What sort of error is "Instance Unavailable"??? – ZiglioUK May 13 '16 at 03:18
  • I get that error when a task has reached its 10 min timeout and has been killed. The stupid new console hides that. – ZiglioUK May 13 '16 at 04:21

1 Answer


After getting in touch with Google support, it emerged that, very rarely, GAE can execute a deferred task more than once. Tasks therefore have to be implemented so that they can safely run multiple times without errors. Ours were not, since we had assumed that each task would be executed only once.
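To illustrate what "safe to execute multiple times" can look like for a task that enqueues a named follow-up task, here is a minimal sketch that runs without the App Engine SDK. The exception class is a local stand-in for `google.appengine.api.taskqueue.TombstonedTaskError`, and `FakeQueue`, `enqueue_once` and `poll_task` are hypothetical names made up for this example, not our actual code. It assumes the only side effect of the task is enqueuing the next named task.

```python
class TombstonedTaskError(Exception):
    """Local stand-in for google.appengine.api.taskqueue.TombstonedTaskError."""


class FakeQueue(object):
    """Hypothetical in-memory queue that rejects reused task names."""

    def __init__(self):
        self.names = set()

    def add(self, name):
        if name in self.names:
            raise TombstonedTaskError(name)
        self.names.add(name)


def enqueue_once(queue, name):
    """Enqueue a named task; treat a duplicate name as 'already enqueued'."""
    try:
        queue.add(name)
        return True
    except TombstonedTaskError:
        # An earlier execution of this task already enqueued the follow-up,
        # so there is nothing left to do and nothing to report as an error.
        return False


def poll_task(queue, step):
    """Task body that is safe to run twice: its one side effect is idempotent."""
    return enqueue_once(queue, "completion-poll-%d" % (step + 1))


queue = FakeQueue()
first = poll_task(queue, 2)   # normal execution
second = poll_task(queue, 2)  # simulated automatic retry by the platform
```

In a real application, the same pattern would presumably mean catching the SDK's own exception around the call that enqueues the follow-up task. Whether swallowing the error is acceptable depends on the task having no other side effects that are unsafe to repeat.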

Martin