I have deployed a simple Spring Boot app in Google App Engine Flexible. The app has two APIs: one to add user data to the DB (xxx.appspot.com/add) and another to get all the user data from the DB (xxx.appspot.com/all).

I wanted to see how GAE scales under load, so I used JMeter to generate load with 100 concurrent users, ramped up over 10 seconds, calling these two APIs with a half-second delay, forever. It runs fine for some time (with just one instance), but after 30 seconds or so it starts to fail with a "java.net.SocketException" or "The server responded with a status of 502".

After this error, when I try to access the same API from the browser, it displays:

Error: Server Error

The server encountered a temporary error and could not complete your request. Please try again in 30 seconds.

The service is back to normal after 30 minutes or so, and whenever the load test runs again it repeats the behavior described above. I expect GAE to auto-scale based on the incoming load and handle it without any downtime (using multiple instances); instead it just crashes or blocks the service (without any information in the log). My app.yaml configuration is:

runtime: java
env: flex
service: hello-service
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 10

I am a bit stuck with this one. Any help would be greatly appreciated. Thanks in advance.

KayKay
  • What do you see in the Logs Viewer? https://cloud.google.com/appengine/docs/flexible/python/writing-application-logs#viewing_logs – Dan Cornilescu Aug 16 '17 at 17:02
  • See https://stackoverflow.com/questions/38012797/google-app-engine-502-bad-gateway-with-nodejs – Ori Marko Aug 16 '17 at 17:33
  • @user7294900 I have changed it to manual scaling in app.yaml. However, I could not find the way to switch from the "google owned" VM to "self owned" VM in the console. I could not find the instances in the GCE area. Also the detailed error I get in the log is in this link *https://jpst.it/13hHk*. – KayKay Aug 17 '17 at 12:55
  • @DanCornilescu The high level logs show this *https://jpst.it/13hIV*. The detailed logs are mentioned in the previous comment. – KayKay Aug 17 '17 at 13:02
  • I have tried disabling and enabling the "health check". Added cpu_utilization, target_utilization properties in app.yaml. However, still I face the same issue. The app also restarts periodically even after overriding it in my code. – KayKay Aug 18 '17 at 14:39
  • @DanCornilescu Further I checked the nginx logs and found "[error] 32#32: *56619 upstream prematurely closed connection while reading response header from upstream, client: xxxx, server: , request: "GET /all HTTP/1.1", upstream: "http://x.x.x.x:8080/_ah/health", host: "x.x.x.x". – KayKay Aug 22 '17 at 16:19
  • I'm sorry - I don't have any idea - I'm not very familiar with java and/or flex. I can only suggest verifying that your app's health check and starting works properly at a slower rate first - maybe not 100 users, but only just enough to get a 2nd instance launch attempt. Or manually stop a normally running instance to check its automatic re-launch sequence, without a lot of traffic. That needs to work perfectly before you embark in serious scalability tests. – Dan Cornilescu Aug 22 '17 at 17:14
  • Are you hitting any quota limits? Console > App Engine > Quotas – Kenworth Sep 13 '17 at 02:11

1 Answer

The solution was to increase the resource configuration; details below.

Because I had not set a resources section, the instance used the default values for both CPU and memory. In this case, the default memory was 0.6 GB. App Engine Flex instances use about 0.4 GB for overhead processes, and since Java is known to consume more memory, it is quite likely that the overhead processes used more than that approximate 0.4 GB. App Engine restarts instances for a variety of reasons, including optimization based on memory use. This explains why the instances went down: the logs show Tomcat starting up (the instances had been restarted), and the requests ended in 502 errors because nginx was not able to complete them. Fixing the memory configuration may lessen, if not completely eliminate, the 502s.

After I specified the resources attribute in app.yaml and increased the values, the 502 errors seem to be gone.
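
As a rough sketch, an app.yaml with an explicit resources section might look like the following. The cpu, memory_gb, and disk_size_gb values are illustrative assumptions, not the exact values used in this deployment; choose them based on the memory use you actually observe for the app.

```yaml
# Sketch of app.yaml with an explicit resources section.
# The cpu, memory_gb, and disk_size_gb values are illustrative assumptions,
# not the values from the original deployment.
runtime: java
env: flex
service: hello-service

# Without this section the instance falls back to the defaults
# (0.6 GB of memory in the case described above).
resources:
  cpu: 2
  memory_gb: 2.3
  disk_size_gb: 10

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 10
```

Leaving the scaling settings unchanged while raising only the resources makes it easier to confirm whether memory was the cause of the restarts.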

KayKay