
We have a Google App Engine Java app with 50 - 120 req/s depending on the hour of the day.

Our frontend appengine-web.xml looks like this:

<instance-class>F1</instance-class>
<automatic-scaling>
    <min-idle-instances>3</min-idle-instances>
    <max-idle-instances>3</max-idle-instances>
    <min-pending-latency>300ms</min-pending-latency>
    <max-pending-latency>1.0s</max-pending-latency>
    <max-concurrent-requests>100</max-concurrent-requests>
</automatic-scaling>

Usually 1 frontend instance manages to handle around 20 req/s. Start up time is around 15s.

I have a few questions :

  • When I change the default frontend version, I get thousands of Error 500 - Request was aborted after waiting too long to attempt to service your request. To avoid that, I switch from one version to the other using the Traffic Splitting feature (by IP address), going from 1% to 100% in steps of 5%. It takes around 5 minutes to do it properly and avoid massive numbers of 500 errors. Moreover, that feature seems to be available only for the default frontend module.

-> Is there a better way to switch versions ?

  • To avoid thousands of Error 500 - Request was aborted after waiting too long to attempt to service your request., we must use at least 3 resident (min-idle) instances. As our traffic grows, even with 3 we still sometimes get bursts of Error 500. Am I supposed to go to 4 residents? I thought App Engine was nice because you only pay for the instances you use, so needing at least half of our running instances to sit idle in order to work properly is not great, is it? It's not really cost-effective: when the load is low, keeping 4 idle instances is a big waste :( What's weird is that requests seem to wait only about 10s before getting a 500: pending_ms=10248

-> Do you have any advice on how to avoid that?

  • Quite often, we also get thousands of Error 500 - A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may be throwing exceptions during the initialization of your application. (Error code 104). I don't understand: there aren't any exceptions, yet we get hundreds of these errors within a few seconds.

-> Do you have any advice on how to avoid that?

Thanks a lot in advance for your help ! ;)

Etienne

1 Answer


Those error messages are mostly related to loading requests that take too long to load and therefore end in something similar to a DeadlineExceededException, which dramatically affects performance and user experience, as you probably already know.

This is a very common issue, especially when using DI frameworks with Google App Engine. So far it is an unavoidable, serious, and unfixed issue with automatic scaling, the scaling policy App Engine has provided for handling public requests since its inception.

Try changing the frontend instance class to F2, especially if your memory consumption is higher than 128MB per instance, and set both the min and max pending latency to 15s so that your requests get more chances to be processed by a resident instance. However, you will still get long response times for some requests, since Google App Engine may not issue a warmup request every time your application needs a new instance, and I understand that F4 would break the bank.
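
As a rough sketch, the suggestion above applied to the appengine-web.xml from the question might look like the following. Only the instance class and the two pending-latency values are changed; the idle-instance and concurrency values are simply carried over from the question, and the `15.0s` notation is assumed to be accepted in the same way as the `1.0s` already used there. Treat it as a starting point to measure against, not a tuned configuration.

```xml
<!-- F2 instead of F1: more memory and CPU per instance, at roughly twice the cost -->
<instance-class>F2</instance-class>
<automatic-scaling>
    <!-- Carried over unchanged from the question -->
    <min-idle-instances>3</min-idle-instances>
    <max-idle-instances>3</max-idle-instances>
    <!-- Requests may wait up to 15s in the pending queue for an existing
         instance before a new instance is started to serve them -->
    <min-pending-latency>15.0s</min-pending-latency>
    <max-pending-latency>15.0s</max-pending-latency>
    <!-- Carried over unchanged from the question -->
    <max-concurrent-requests>100</max-concurrent-requests>
</automatic-scaling>
```

The trade-off is that a request can now sit in the queue for up to 15s under load instead of being routed to a cold instance.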

Joar
  • In fact, I optimized the code through profiling. I mainly improved 2 things that were big bottlenecks: I got rid of the Jackson framework for parsing JSON (80%) and improved memcache management (20%). Now an F1 instance can handle up to 80-90 req/s. I don't use any DI framework. I tried F2 instances but it wasn't really better (and cost me twice as much ^^). For the moment, with the improved performance, my number of 500s has really dropped and I can switch versions more easily. – Etienne Feb 05 '14 at 08:33
  • I can now use only 2 resident instances (and I think 1 would be enough). Also, I now allow only 50 concurrent requests, so when an instance dies fewer 500s are thrown, and I still get the same per-instance performance. I don't like setting the min/max pending latency to 15s, as I want to avoid latency for end users as much as possible. My memory consumption is generally 180-200 MB in the App Engine console, but it works perfectly with F1 instances. I don't get any OutOfMemoryError – Etienne Feb 05 '14 at 08:47
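
For reference, the settings described in the two comments above could be written roughly as follows. This is a sketch reconstructed from the comments, not the asker's actual file: the comments state only the instance class, the 2 resident instances, and the limit of 50 concurrent requests. The pending-latency values are assumed to be unchanged from the question, and `max-idle-instances` is left out because the comments do not mention it.

```xml
<!-- Still F1: F2 was tried but was not noticeably better for twice the cost -->
<instance-class>F1</instance-class>
<automatic-scaling>
    <!-- Down from 3 resident instances after the code optimizations -->
    <min-idle-instances>2</min-idle-instances>
    <!-- Assumed unchanged from the question; not stated in the comments -->
    <min-pending-latency>300ms</min-pending-latency>
    <max-pending-latency>1.0s</max-pending-latency>
    <!-- Halved from 100 so that fewer in-flight requests fail when an instance dies -->
    <max-concurrent-requests>50</max-concurrent-requests>
</automatic-scaling>
```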