5

It turns out that retrieving low level datastore entities that are stored memcache is painfully slow. Since objectify caches entities as the low level datastore Entity type, this results in poor performance when fetching many entities from memcache using objectify.

The real question is why is it slow to deserialize the Entity type from memcache? I put together a sample project to demonstrate the disparity in retrieving Entities from memcache vs plain strings or a simple Map.

Here's the code:

https://github.com/aleemstreak/perftest or the relevant file: https://github.com/aleemstreak/perftest/blob/master/src/com/rewardly/perftest/PerftestServlet.java

Also, I deployed it so you can see how big a difference this is in production: aleemsandbox.appspot.com/perftest . Its a naive profiler but it does show a huge disparity in performance. Refresh the page a few times to see the difference. Here is some sample output:

Storing String Data Test
-------------------------
generateData: 0ms
storeData: 10ms
fetchData: 9ms


Storing Map Data Test
-------------------------
generateData: 0ms
storeData: 21ms
fetchData: 92ms


Storing Entity Data Test
-------------------------
generateData: 69ms
storeData: 24ms
fetchData: 792ms

The first section shows the time it takes to store 1000 strings in memcache and then fetch it right back. The next example does the same for 1000 Map objects and finally the last example stores and retrieves 1000 low level Entity types. You can see the huge time increase to retrieve Entity types.

Any ideas why entities could be slow to deserialize from memcache?

UPDATE 1

Following a suggestion in one of the answers, I also logged the cumulative size of the objects being stored in memcache and the results no print them out. I also added one other test case - instead of storing an Entity directly, instead, I serialize the Entity to a byte[] myself first, then store it in memcache. Here are the results:

StringBenchmark
----------------
Average Fetch Time: 40.16ms
Fetch Size: 24.41KB


MapBenchmark
----------------
Average Fetch Time: 27.36ms
Fetch Size: 102.54KB


EntityBenchmark
----------------
Average Fetch Time: 1029.88ms
Fetch Size: 463.87KB


EntityPreSerializedBenchmark
----------------
Average Fetch Time: 218.82ms
Fetch Size: 490.23KB

Whats interesting here are the last two results. Although they are roughly the same size, it takes about 1/5 the time to fetch and deserialize the byte[] manually.

The code in the github repo has been updated and the deployed sample app also has the latest code so feel free to run this test there and see the results.

aloo
  • 5,331
  • 7
  • 55
  • 94
  • One thing I forgot to mention is that this code is profiled using appstats as well and all the fetches from memcache are shown to be <10ms. That means that the remainder of the time is spent deserializing the objects.... – aloo Mar 26 '14 at 16:14
  • Are you able to also post for EntityPreSerializedBenchmark how long serialization takes, as well as fetch time? – Nick Mar 26 '14 at 22:44
  • sorry by fetch time do you mean the fetch time to actual memcache as listed by appstats? – aloo Mar 27 '14 at 06:59
  • looking at appstats, every RPC to memcache is sub 10ms – aloo Mar 27 '14 at 07:02

3 Answers3

1

Perhaps I'm nit-picking your wording, but all of the higher level API (JDO, JPA, and Ofy) use the low-level API, so all entities are really LL API entities. So, you note that "Objectify caches entities as the low level datastore Entity type", but wouldn't all higher level datastore API's do the same (assuming they've been instructed to use such caching)? So I don't think this has anything to do with ofy.

Moving on, it seems natural that your 3rd test takes much longer then the others - the Entity type adds considerable overhead relative to the simple String and even Map types. I am a bit surprised that it takes that much longer, but taking a step back, you are fetching 1000 entities, so that is still < 1 ms per entity.

I think you should add 1 more test. There is a java API to get the size in memory of an instance (can't remember it off hand). Determine the memory size of the entities you are using in your test, and then change the string test to use an object of the same size. That way we can isolate whether this has anything to do with the Entity type itself, or is just a result of the considerably greater size of the objects being cached in the 3rd test.


Update in response to new test results... Interesting. That does seem to confirm your theory that it is the serializing done by the memcache code that is causing the slowdown. That is odd, though - wouldn't memcache simply serialize the object in a manner very similar to what you are doing?

Perhaps this post helps:
manual serialization / deserialization of AppEngine Datastore objects


The memcache interface that you are using for fetching Entity objects is:

java.lang.Object get(java.lang.Object key)

https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/memcache/MemcacheService#get(java.lang.Object)

So it's not taking a seralizable, so rather then serializing the way that you are, it is probably using introspection. That would explain why 'EntityPreSerializedBenchmark' is much faster then 'EntityBenchmark'.

Community
  • 1
  • 1
Tom
  • 17,103
  • 8
  • 67
  • 75
  • Actually what I meant was that Ofy stores the low-level Entity class in MEMCACHE when it has caching enabled. I don't think JDO/JPA can cache in memcache automatically so I don't think its an issue for those frameworks. – aloo Mar 26 '14 at 20:37
  • Nonetheless - the issue remains. I updated my question to take into account your feedback and tried an approach of serializing the Entity objects to byte[] before placing them in memcache, similarly, they are deserialized into Entity objects from byte[] after fetched (this is included in the fetch benchmark time. (I updated the benchmark code and deployed it for ppl to play with) – aloo Mar 26 '14 at 20:39
  • Update Q to respond to new info. – Tom Mar 26 '14 at 21:06
  • unfortunately that post is for python. – aloo Mar 26 '14 at 21:08
1

Turns out to be a bug in the appengine sdk. They are aware of the problem and fixing presumably.

aloo
  • 5,331
  • 7
  • 55
  • 94
0

This regression in serialization performance was fixed in the App Engine 1.9.5 SDK.