
To avoid high latency spikes in GAE datastore writes, I want to implement a write-behind cache (using the Java low-level API): data is written synchronously to memcache and then asynchronously to the datastore, so that the request can return quickly.
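For context, the pattern I have in mind looks roughly like the following self-contained sketch (it uses `java.util.concurrent` only; the two maps stand in for memcache and the datastore, and the retry loop stands in for handling a datastore contention exception — none of the GAE APIs appear here):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Minimal write-behind sketch: reads and writes hit the in-memory cache
// immediately; persistence happens on a background thread with retries.
public class WriteBehindCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();     // stands in for memcache
    private final Map<String, String> datastore = new ConcurrentHashMap<>(); // stands in for the datastore
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    public void put(String key, String value) {
        cache.put(key, value);                    // synchronous: the request can return now
        writer.submit(() -> persist(key, value)); // asynchronous: write-behind
    }

    public String get(String key) {
        return cache.getOrDefault(key, datastore.get(key));
    }

    // Retry loop for contention; a real implementation would also need a
    // dead-letter path so data is never silently dropped after retries.
    private void persist(String key, String value) {
        for (int attempt = 0; attempt < 5; attempt++) {
            try {
                datastore.put(key, value); // a real datastore write may throw on contention
                return;
            } catch (RuntimeException contention) {
                try {
                    Thread.sleep(10L << attempt); // exponential backoff between retries
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public void shutdown() throws InterruptedException {
        writer.shutdown();
        writer.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The open problem is exactly the retry loop: inside a single GAE request there is no background thread that outlives the request, so where does `persist` run?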

This, however, means that I somehow need to deal with exceptions arising from datastore contention (e.g. to initiate a retry) asynchronously as well. More precisely, I need to be able to react to contention occurring after the request has returned. How can I do that? Using the task queue for async write processing is not an option, because pushing to the queue is said to be only marginally faster than a datastore write.

If that is impossible, then what are good ways to implement a write-behind cache? Or, how should one deal with slow writes in a scenario where data loss is not an option?

paul
  • Did you check the Objectify 4 API? Save and load operations are async in it. Perhaps you can use it or get some ideas from its implementation. – Kirill Lebedev Jan 11 '13 at 01:58
  • I'm not positive, but from my reading, it looks like Objectify 4's async operations still must complete before the request finishes, so they're only async with respect to other code running during the request. I think this question is about deferring the write so it can complete (potentially) after the request returns. – Andy Dennie Jan 11 '13 at 14:11
  • Yes, exactly. Thanks Andy for the clarification. – paul Jan 11 '13 at 15:37

1 Answer


Memcache is volatile and it may flush data at any time, so this approach is very unreliable.

You would be best off using the Push Task Queue, via the DeferredTask helper class. Here is an example.
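A sketch of the DeferredTask approach might look like the following. To keep it self-contained, the `DeferredTask` interface is stubbed locally (the real `com.google.appengine.api.taskqueue.DeferredTask` is likewise just a `Serializable` `Runnable`), and a map stands in for the datastore; the actual enqueue call is shown only as a comment:

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local stub mirroring com.google.appengine.api.taskqueue.DeferredTask,
// which is simply a Serializable Runnable.
interface DeferredTask extends Serializable, Runnable {}

// A deferred datastore write: the payload is serialized into the task, and
// run() is executed later by the task queue. If run() throws an unchecked
// exception, the queue retries the task automatically, which is how
// contention can be handled after the request has returned.
public class DeferredWrite implements DeferredTask {
    // Stand-in for the datastore so this sketch is runnable on its own.
    static final Map<String, String> DATASTORE = new ConcurrentHashMap<>();

    private final String key;
    private final String value;

    public DeferredWrite(String key, String value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public void run() {
        // In real code this would be a DatastoreService put; letting a
        // ConcurrentModificationException propagate triggers a retry.
        DATASTORE.put(key, value);
    }

    // On App Engine the task would be enqueued with something like:
    //   QueueFactory.getDefaultQueue()
    //       .add(TaskOptions.Builder.withPayload(new DeferredWrite(k, v)));
}
```

The main appeal is that the task queue's built-in retry policy replaces the hand-rolled retry loop, at the cost of the enqueue latency discussed in the comments below.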

Peter Knego
  • The memcache is of course only used for speeding up cache reads, not for persistence. The problem is how to implement the "write-behind" (async) strategy reliably, so that it never fails due to contention. Using the task queue is an option. How fast is it to push something onto the queue? It would need to be significantly faster than a datastore write to make sense. Also, how much data can be associated with a task? Thanks! – paul Jan 11 '13 at 11:54
  • You can post 100K of data to a task (or just pass the key to the memcache data, of course). Re: performance comparison of task queue push vs. datastore write... http://stackoverflow.com/questions/6259984/google-app-engine-is-adding-to-the-task-queue-faster-than-doing-a-datastore-wri – Andy Dennie Jan 11 '13 at 14:12
  • Thanks for that link: pushing a task is only marginally faster than a datastore write, so that doesn't seem to be an option. – paul Jan 11 '13 at 15:41
  • What kind of speed are you talking about? Throughput (number of writes in parallel from different requests) or latency of single write? – Peter Knego Jan 11 '13 at 19:31
  • Latency of a single write. – paul Jan 13 '13 at 08:35
  • Nothing you can do to speed up a single write. What kind of latency do you see that you deem unacceptable? – Peter Knego Jan 13 '13 at 09:29
  • Also note that even async calls block the end of the request, so what you are doing (memcache + async) is not having the desired effect. See the second comment (Nick is a GAE engineer): http://stackoverflow.com/a/7508403/248432 – Peter Knego Jan 13 '13 at 09:39
  • Thanks a lot - if that is so, then I'll use a different strategy (decouple frontend and backend code). – paul Jan 13 '13 at 17:40