How to health check a service in GitHub Actions?

Question

I'm struggling to configure a health check for a service container in a GitHub Actions workflow.

The container is jboss/keycloak:12.0.4.

The workflow looks like this:

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest

    services:
      keycloak:
        image: jboss/keycloak:12.0.4
        options: --name keycloak

    steps:
      ...

The container needs around 30-40s to become healthy. I tried two approaches without success.

Services options:

options: --health-cmd curl "http://localhost:8080/auth/realms/master" --health-interval 30s

This worked locally, but in the workflow GitHub seems to fail before Docker has completed the health check. Whatever value I set for --health-interval, GitHub checks 4 times (not at the interval I passed) and then fails.

Hack a healthcheck step:

steps:
  - name: Healthcheck
    continue-on-error: true
    run: |
      echo "HEALTHCHECK, BECAUSE"
      docker exec keycloak curl -s --fail "http://localhost:8080/auth/realms/master" 1>/dev/null
      while [ "$?" != "0" ]; do
        docker exec keycloak curl -s --fail "http://localhost:8080/auth/realms/master" 1>/dev/null
        sleep 10s
      done

This looks bad and doesn't work either. The step is set to continue-on-error, but that doesn't get the step itself past the first docker exec statement.
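A likely reason the loop misbehaves: on Linux runners, run steps execute under bash -e by default, so the first failing docker exec ends the script before the loop starts. Inside the loop, $? would also reflect sleep rather than the probe. A minimal sketch of a retry loop that avoids both problems follows; the wait_for helper name and its attempt and delay parameters are illustrative assumptions, not part of GitHub Actions or Docker.

```shell
# wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds, giving up after
# MAX_ATTEMPTS tries with DELAY_SECONDS between them.
wait_for() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  # Testing the command directly in `until` means a failing probe does not
  # trigger `set -e`, and no stale `$?` is consulted.
  until "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "ready after $attempt attempt(s)"
}
```

In the Healthcheck step this could be called as: wait_for 30 5 docker exec keycloak curl -s --fail "http://localhost:8080/auth/realms/master" (30 attempts, 5 seconds apart; both values are guesses to tune). Because the helper returns non-zero on timeout, continue-on-error is no longer needed to keep the loop alive.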

So any ideas on how to solve this?

score 2 · Answer 1 · answered Nov 10 '21 at 09:53

I am not sure how to make your specific example work, but the health check system seems to be composed of two processes:

Docker's built-in health check system (configurable with the --health-* options)
The runner's polling mechanism, running docker inspect to see if the container is healthy (using exponential back-off, not configurable)

The comment in this GitHub issue explains it in more detail.

I suggest you tweak the following values in such a way that the service becomes healthy before the exponential back-off system gives up:

--health-interval: interval between health checks
--health-timeout: maximum time a single health check may run before it counts as failed
--health-retries: number of consecutive failing health checks needed before the container is reported unhealthy
--health-start-period: initialization time during which failing health checks do not count towards the retries; useful for slow-starting services

score 1 · Answer 2 · answered Apr 11 '21 at 15:57

Same here. I'm trying a similar approach, setting the health-check options for the Docker container like this:

        options: >-
          --health-cmd "curl -f http://localhost:8080/auth/realms/master"
          --health-interval 30s
          --health-timeout 10s
          --health-retries 5
          --health-start-period 30s

But without success. GitHub apparently does not honor these settings: the log shows that it tried to initialize the container only once:

Waiting for all services to be ready
  /usr/bin/docker inspect --format="{{if .Config.Healthcheck}}{{print .State.Health.Status}}{{end}}" 3cb1....
  healthy
  postgres service is healthy.
  /usr/bin/docker inspect --format="{{if .Config.Healthcheck}}{{print .State.Health.Status}}{{end}}" d77a4....
  starting
  keycloak service is starting, waiting 2 seconds before checking again.
  /usr/bin/docker inspect --format="{{if .Config.Healthcheck}}{{print .State.Health.Status}}{{end}}" d77a4....
  unhealthy
  Error: Failed to initialize, keycloak service is unhealthy.
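
To see why Docker marked the container unhealthy, the probe output that Docker records can help. A small sketch, assuming the container is reproduced locally with the same options and is named keycloak (the show_health helper name is hypothetical). docker inspect exposes the status, failing streak, and per-probe log under .State.Health:

```shell
# show_health CONTAINER
# Hypothetical helper: prints Docker's recorded health state for CONTAINER
# as JSON, including the exit code and output of recent probes.
show_health() {
  docker inspect --format '{{json .State.Health}}' "$1"
}
```

Running show_health keycloak after the container turns unhealthy should reveal whether the curl probe itself is failing (for example with a non-zero exit code) or simply running too early.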
