34

I've deployed a Node.js app on the Google App Engine Flex runtime using the following `app.yaml` configuration:

    runtime: nodejs
    env: flex
    health_check:
      enable_health_check: True
      check_interval_sec: 20
      timeout_sec: 4
      unhealthy_threshold: 2
      healthy_threshold: 2

According to the health check documentation, the health checks should hit the `/_ah/health` endpoint every 20 seconds. However, I noticed that my app is getting spammed with these health checks multiple times per second, even though the app responds with a 200 status code:

[Screenshot of the request log]

Any idea why this is happening?

Dan Cornilescu
Mihai Tomescu
  • 1
    @DanCornilescu fixed it – Mihai Tomescu Mar 16 '17 at 18:14
  • 1
    How many instances running? – Dan Cornilescu Mar 16 '17 at 18:17
  • 1
    @DanCornilescu one instance – Mihai Tomescu Mar 16 '17 at 18:19
  • 2
    BTW - better to use the image tool - it allows inlining the images. – Dan Cornilescu Mar 16 '17 at 18:21
  • 1
    But *how* did @DanCornilescu fix it? Please share. – Eliot Mar 19 '17 at 11:00
  • 1
    @Eliot Mihai's 'fixed it' comment was in response to my comment about the initial image link in the question being broken, which I deleted after the link was fixed. Not about the actual problem ;) – Dan Cornilescu Mar 19 '17 at 12:29
  • 2
    FWIW, there is inconsistency in how that particular config value is interpreted, maybe try out a different value? See http://stackoverflow.com/questions/42886929/deploying-to-google-app-engine-does-not-work-due-to-health-check-interval-even-t/42887051#42887051. – Dan Cornilescu Mar 19 '17 at 13:36
  • 1
    @DanCornilescu Good find, I'll keep an eye on that issue. For now I've turned off health checks entirely. – Mihai Tomescu Mar 21 '17 at 20:22
  • 3
    I am also having issues with it; it's a total mess from what I can see. – 1977 Mar 29 '17 at 21:23
  • 1
    I am also facing this issue. I wasn't even able to stop it, even after setting `enable_health_check: False`. – Ramesh Lingappa Apr 07 '17 at 07:52
  • Is anyone able to at least disable the health check? `enable_health_check: False` does nothing for me either. I'm responding to `/_ah/health` with a 200 status code and an `OK` body. Still getting spammed with a million health check pings per second. – hzhu Apr 15 '17 at 21:32
  • @Mihai Tomescu, how did you disable the health checks? `enable_health_check: False` isn't working for some of us. Did it work for you? – hzhu Apr 15 '17 at 21:33
  • @HenryZhu yes setting `enable_health_check: False` worked for me. Make sure your `app.yaml` is formatted properly and redeploy (see https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml#health_checks). – Mihai Tomescu Apr 17 '17 at 19:52
  • +1, I'm also hitting this on python flexible environment, and setting health check to False has no effect. – dstrockis Apr 19 '17 at 03:34
  • This bug makes it impossible to use the log for its intended purpose. – pscl Jun 02 '17 at 20:33

4 Answers

15

Unfortunately, it does seem like we have a bug in our docs. Today, apps do indeed get health checked quite frequently.

The reasons are manifold, but in general each VM is hit by 3 * 2 = 6 different health checkers, each firing at the interval you specify (by default, a very aggressive 1 second). This is because there are 2 types of health check (autohealer and load balancer), with 3 checkers of each type for availability reasons.
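To make that arithmetic concrete, here is a minimal sketch that estimates the request rate one VM would see. The function and constant names are hypothetical; the only inputs taken from this answer are the 2 check types and the 3 checkers per type.

```javascript
// Hypothetical helper: estimates health check traffic for a single VM,
// using the figures quoted in the answer (2 check types x 3 checkers each).
const CHECK_TYPES = 2;        // autohealer and load balancer
const CHECKERS_PER_TYPE = 3;  // redundancy for availability

function expectedChecksPerSecond(checkIntervalSec) {
  if (!(checkIntervalSec > 0)) {
    throw new RangeError('checkIntervalSec must be a positive number');
  }
  // Every checker sends one request per interval.
  return (CHECK_TYPES * CHECKERS_PER_TYPE) / checkIntervalSec;
}

console.log(expectedChecksPerSecond(1));   // 6
console.log(expectedChecksPerSecond(20));  // 0.3
```

Under these assumptions, a 20 second interval should produce roughly one request every 3.3 seconds. Seeing several requests per second with that setting would suggest the configured interval is not being applied, which fits the comments on the question about the value being interpreted inconsistently.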

That being said, we are currently working on a new form of health checks that will be released pretty soon and should fix this and other problems with the existing health checking behavior (at least by making the defaults more manageable and giving users more tuning options).

Stay tuned!

Andre Rodrigues
  • 2
    Are there any [public issues](https://issuetracker.google.com/issues?q=componentid:187191%2B) tracking this new feature/change that the community can follow for updates? – Nicholas Apr 24 '17 at 14:13
  • 1
    What's the tracking bug for this issue and when can we expect a fix? This bug is rampant https://stackoverflow.com/questions/41732767/google-cloud-app-engine-pinging-my-server-despite-disabling-health-check https://stackoverflow.com/questions/30238547/log-of-ah-start-in-app-engine-instance https://stackoverflow.com/questions/41333978/app-engine-flexible-environment-java-servlet-getting-too-many-ah-health-reques https://stackoverflow.com/questions/38604089/specify-google-app-engine-health-check-endpoint – N S May 31 '17 at 17:15
  • So is the current recommendation to use ["updated health checks"](https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml#updated_health_checks)? Going by the documentation, that's in beta, the "legacy health checks" are still the supported, and most reliable, solution, and we should use them unless we need the new features. But legacy health checks do not appear to work as documented. – aldel Oct 30 '17 at 19:31
  • So is this still not resolved? I'm using the updated health checks and am also seeing 5 readiness checks per second and liveness checks coming in in bursts of 12 over a few seconds every interval (30s). – Peter Mar 07 '18 at 17:08
  • We are also having the same issue as @Peter here. Moved over to the new health checks and are getting hit by 12 checks every 30 seconds. – KingChezzy Mar 12 '18 at 10:56
  • I'm also experiencing the same issue with updated health checks. My liveness checks are configured with `check_interval_sec: 30` and I'm receiving up to 4 health check requests per second, which doesn't make any sense even considering there are several checkers sending requests for reliability purposes. – dmitryb Mar 15 '18 at 09:00
3

I don't have a solution to the root problem. But if the spamming makes it impossible to use the log for its intended purpose, as it does for me, here is a workaround:

  1. Enable the 'Advanced Log Filters' (the tiny down arrow next to the search field in Stackdriver Logging)

  2. Add this to the Search query

    NOT textPayload : (health)
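
If you work with exported log entries rather than the console, roughly the same exclusion can be applied locally. This is only a sketch: the function name and the sample entries are made up for illustration, and it assumes each entry is a plain object that may carry a `textPayload` string. The case-insensitive substring match is my assumption about how closely you want to mirror the console filter.

```javascript
// Hypothetical helper: drops entries whose textPayload mentions "health".
// Entries without a textPayload string are kept.
function withoutHealthChecks(entries) {
  return entries.filter((entry) => {
    const text = typeof entry.textPayload === 'string' ? entry.textPayload : '';
    return !text.toLowerCase().includes('health');
  });
}

// Made-up sample data, not real log output.
const sample = [
  { textPayload: 'GET /_ah/health 200' },
  { textPayload: 'GET /api/users 200' },
  { jsonPayload: { message: 'no text payload here' } },
];

console.log(withoutHealthChecks(sample).length); // 2
```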

pscl
1

I also run NodeJS in GAE Flex env. Health checks were also spamming the server log. The following few things helped me in reducing them:

  1. Although the Google documentation (https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml#health_checks) says the health check config is not required, I set the values explicitly anyway to lower the frequency of the health check calls.
  2. Use the "Advanced Log Filter" to hide the health check entries if they are too distracting.
  3. The Google documentation (https://cloud.google.com/appengine/docs/flexible/nodejs/how-instances-are-managed) says it's not required to implement a handler for health checks, but I implemented one explicitly anyway. I added a handler for the "/_ah/healthcheck" endpoint in the express.js server and put the route at the top of the app.js file, so the health check requests are answered right away. This reduced some of the noise caused by health check requests passing through the express app logic.
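
The idea in point 3 can be sketched as follows. To stay dependency-free, this uses Node's built-in `http` module instead of express.js; in an Express app the equivalent is to register the health route before any other middleware. The path below is the `/_ah/health` endpoint named in the question, and `handleApp` is a hypothetical stand-in for the rest of the application, so adjust both to your setup.

```javascript
const http = require('http');

// Path taken from the question; change it if your checks arrive elsewhere.
const HEALTH_PATH = '/_ah/health';

// Hypothetical stand-in for the rest of the application.
function handleApp(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello from the app');
}

// Answers health checks first, before any other application logic runs.
function handleRequest(req, res) {
  const path = req.url.split('?')[0];
  if (path === HEALTH_PATH) {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('OK');
    return;
  }
  handleApp(req, res);
}

// Builds the server without starting it; call .listen(port) to serve.
function createServer() {
  return http.createServer(handleRequest);
}
```

Calling `createServer().listen(process.env.PORT || 8080)` would start it. Note that this only keeps health checks out of the application logic; it does not change how often the checks arrive.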
0

Use the advanced filter and say "NOT _ah/health".

Removing the nginx.request log helps as well.

Robert Christian