
I am using Redash, where supervisord manages the Redash worker processes.

It uses a supervisord event listener for health checks.

Source: https://github.com/getredash/redash/blob/5186acb604fbbcf3e07b83ac793682f4c550b91d/redash/cli/rq.py#L98

The event listener for the health check in the supervisord worker configuration looks like this:

[eventlistener:worker_healthcheck]
serverurl=AUTO
command=./manage.py rq healthcheck
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
events=TICK_60

When the above event listener runs as part of supervisord, the log output looks like this:

2023/02/01 11:34:01 [worker_healthcheck] WorkerHealthcheck: Worker rq:worker:cc79d5a02143436f99d4bccf5af64c5d healthcheck: Is busy? False. Seen lately? False (241 seconds ago). Has nothing to do? True (0 jobs in watched queues). ==> Is healthy? True
2023/02/01 11:34:01 [worker_healthcheck] `RQ Worker Healthcheck` check succeeded for process worker-1
2023/02/01 11:35:01 [worker_healthcheck] Received TICK_60 event from supervisor
2023/02/01 11:35:01 [worker_healthcheck] Performing `RQ Worker Healthcheck` check for process name worker-0
2023/02/01 11:35:01 [worker_healthcheck] Performing `RQ Worker Healthcheck` check for process name worker-1
2023/02/01 11:35:01 [worker_healthcheck] WorkerHealthcheck: Worker rq:worker:ca2878d2e80e4c47a8414478ed15873f healthcheck: Is busy? False. Seen lately? False (304 seconds ago). Has nothing to do? True (0 jobs in watched queues). ==> Is healthy? True
2023/02/01 11:35:01 [worker_healthcheck] `RQ Worker Healthcheck` check succeeded for process worker-0
RESULT 2
OKREADY

But when I run the same command directly in the container, it doesn't give me the same output: it prints READY and then waits, so I have to interrupt it after some time (^C).

Also, before running this command I need to export SUPERVISOR_SERVER_URL=http://localhost:9001.

redash@redash-adhocworker-778fc5dbb6-xrnp9:/app$ ./manage.py rq healthcheck
2023/02/01 11:34:14 [worker_healthcheck] Starting the health check for worker process Checks config: [(<class 'redash.cli.rq.WorkerHealthcheck'>, {})]
2023/02/01 11:34:14 [worker_healthcheck] Installing signal handlers.
READY

^C2023/02/01 11:34:58 [worker_healthcheck] Got signal 2
2023/02/01 11:34:59 [worker_healthcheck] Health check for worker process has been told to stop.
2023/02/01 11:34:59 [worker_healthcheck] Done.
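
For context, my understanding of the supervisor event listener protocol is that the listener prints READY, then blocks reading an event from stdin: one header line of space-separated key:value tokens, followed by a payload of exactly `len` bytes. That would explain why the command just sits there in a terminal, since nothing is sending it a TICK_60 event. Below is a minimal sketch of what such an event looks like. The helper names (`build_tick_event`, `parse_header`) are made up for illustration, and the pool name and serial numbers are placeholder values, not something taken from Redash.

```python
def build_tick_event(when: int, serial: int = 1) -> bytes:
    """Build a TICK_60 notification in the shape supervisord writes to a listener's stdin."""
    # The payload of a TICK event is a single "when:<unix timestamp>" token.
    payload = f"when:{when}".encode()
    # The header is one line; "len" gives the payload size in bytes.
    # Pool name and serials below are placeholders (assumptions).
    header = (
        f"ver:3.0 server:supervisor serial:{serial} "
        f"pool:worker_healthcheck poolserial:{serial} "
        f"eventname:TICK_60 len:{len(payload)}\n"
    ).encode()
    return header + payload


def parse_header(line: str) -> dict:
    """Split a header line into a dict of its key:value tokens."""
    return dict(token.split(":", 1) for token in line.split())


if __name__ == "__main__":
    import sys
    import time

    # Write one fake event to stdout so it can be piped into a listener.
    sys.stdout.buffer.write(build_tick_event(int(time.time())))
```

If this is saved as, say, fake_tick.py (a hypothetical file name), piping its output into `./manage.py rq healthcheck` might trigger a single check outside supervisord. I have not verified this, and I don't know how the listener behaves once stdin is closed after the first event.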

When the health check fails continuously, the logs look like this:

2023/02/01 06:22:01 [worker_healthcheck] Received TICK_60 event from supervisor
2023/02/01 06:22:01 [worker_healthcheck] No processes in state RUNNING found for process worker
RESULT 2
OKREADY

Questions

  • How can I run the health check script (./manage.py rq healthcheck) from a terminal, outside of supervisord, and get the same health check status?
  • Can I add the autorestart flag to the event listener configured for the health check, and would autorestart restart all the programs configured in supervisord or only the listener?
  • I found the address for SUPERVISOR_SERVER_URL, which is needed to run the health check (./manage.py rq healthcheck), by trial and error. How can I determine it properly?

Any other suggestions on how to restart the supervisord programs when no processes are found in the RUNNING state are welcome.
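
To make the last point concrete, here is a rough sketch of the kind of restart I have in mind, using supervisord's XML-RPC interface. It assumes the HTTP server is reachable at the same address I export as SUPERVISOR_SERVER_URL (http://localhost:9001) and that the RPC endpoint is at /RPC2; a unix:// socket URL would need a different transport. The function names `startable` and `start_stopped` are my own, not part of supervisor or Redash.

```python
import xmlrpc.client

# States from which a start request makes sense (assumption: processes in
# STARTING or BACKOFF are left for supervisord to handle itself).
STARTABLE_STATES = {"STOPPED", "EXITED", "FATAL"}


def startable(process_infos):
    """Return 'group:name' for each process that is in a startable state."""
    return [
        f"{info['group']}:{info['name']}"
        for info in process_infos
        if info["statename"] in STARTABLE_STATES
    ]


def start_stopped(server_url="http://localhost:9001"):
    """Ask supervisord to start every process that is not running."""
    proxy = xmlrpc.client.ServerProxy(server_url.rstrip("/") + "/RPC2")
    names = startable(proxy.supervisor.getAllProcessInfo())
    for name in names:
        # May raise xmlrpc.client.Fault if the process cannot be started.
        proxy.supervisor.startProcess(name)
    return names
```

Calling `start_stopped()` inside the container would return the names it tried to start. I am not sure whether this is the right approach or whether supervisord has a built-in way to do the same thing from an event listener.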
