My application uses Sidekiq to handle long (several minutes) running background tasks. Deployments are done with Capistrano 2 and all processes are monitored with Monit.

I have used capistrano-sidekiq to manage the sidekiq process during deployments, but it has not worked perfectly. Sometimes during a deployment a new sidekiq process is started but the old one is not killed. I believe this happens because capistrano-sidekiq does not operate through Monit during the deployment.

The second problem is that, because my background tasks can take several minutes to complete, my deployment should allow two sidekiq processes to coexist. The old sidekiq process should be allowed to complete the tasks it is processing while a new sidekiq process starts picking up new tasks.

I have been thinking about adding something like this to my deploy script:

When deployment starts:

  • I tell Monit to unmonitor the sidekiq process
  • I stop the current sidekiq process and give it 10 minutes to finish its tasks

After the code has been updated:

  • I start a new sidekiq process and tell Monit to start monitoring it.
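
A minimal shell sketch of the steps above, under several assumptions: the Monit service is named `sidekiq`, the environment is `staging`, the log path is a placeholder, and the old process was started with a `-t` shutdown timeout long enough (600 seconds here) for running tasks to finish. The `/home/staging` paths follow the layout used elsewhere in this thread, and the `-d`, `-P`, `-L` flags are those of Sidekiq 3.x. Setting `DRY_RUN=1` only prints the commands.

```shell
#!/bin/sh
# Sketch only. Assumptions: Monit service name "sidekiq", "staging" environment,
# the log path, and a 600 s (10 min) shutdown timeout.
APP_ROOT=/home/staging
PIDFILE="${PIDFILE:-$APP_ROOT/shared/pids/sidekiq.pid}"
TIMEOUT=600

# With DRY_RUN=1, print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# When the deployment starts: stop Monit from restarting Sidekiq, then ask
# the old process to shut down. kill returns immediately; Sidekiq itself waits
# up to its -t timeout for busy workers before terminating them.
before_update() {
  run monit unmonitor sidekiq
  run kill -TERM "$(cat "$PIDFILE")"
}

# After the code has been updated: start a new process from the new release
# and hand it back to Monit. The old process may still be finishing its jobs.
after_update() {
  run cd "$APP_ROOT/current" || return 1
  run bundle exec sidekiq -d -e staging -t "$TIMEOUT" \
    -P "$PIDFILE" -L "$APP_ROOT/shared/log/sidekiq.log"
  run monit monitor sidekiq
}
```

`before_update` would be hooked in before the code update and `after_update` once the new release is in place. Because both processes share one pid file path in this sketch, the new process overwrites the old PID as soon as it starts.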

I may need to move the sidekiq process pid file into the release directory if the pid file is not removed until the stopped sidekiq process has eventually been killed.

How does this sound? Any caveats spotted?

EDIT:

Found a good thread about this same issue.

http://librelist.com/browser//sidekiq/2014/6/5/rollback-signal-after-usr1/#f6898deccb46801950f40ad22e75471d

Mika

1 Answer

Seems reasonable to me. The only possible issue is losing track of the old Sidekiq's PID but you should be able to use ps and grep for "stopping" to find old Sidekiqs.
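
As a rough illustration of the `ps` and grep approach, the filter below reads `ps` output and prints the PIDs of Sidekiq processes whose title contains "stopping". The function name `stopping_sidekiq_pids` is made up for this sketch, and the exact process title format may differ between Sidekiq versions, so treat the pattern as an assumption to check against your own `ps` output.

```shell
#!/bin/sh
# Read `ps` output (PID in the first column) on stdin and print the PIDs of
# processes whose command line mentions both "sidekiq" and "stopping".
# The awk process itself is excluded, since its arguments contain both words.
stopping_sidekiq_pids() {
  awk '/sidekiq/ && /stopping/ && !/awk/ { print $1 }'
}

# Usage against the live process table:
#   ps -e -o pid,args | stopping_sidekiq_pids
```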

Mike Perham
  • I tried to stop Sidekiq in before:update_code with the command `if [ -d /home/staging/current ] && [ -f /home/staging/shared/pids/sidekiq.pid ] && kill -0 $(cat /home/staging/shared/pids/sidekiq.pid) > /dev/null 2>&1; then cd /home/staging/current && bundle exec sidekiqctl stop /home/staging/shared/pids/sidekiq.pid 120 ; else echo 'Sidekiq is not running'; fi` but it didn't wait 120 seconds. Instead it killed a running Sidekiq thread after 8 seconds. – Mika Sep 20 '14 at 21:27
  • And this is what is logged:
    2014-09-20T21:00:38Z 25932 TID-ae9x4 INFO: Shutting down
    2014-09-20T21:00:39Z 25932 TID-13ikrg INFO: Shutting down 24 quiet workers
    2014-09-20T21:00:39Z 25932 TID-13ikrg INFO: Pausing up to 8 seconds to allow workers to finish...
    2014-09-20T21:00:47Z 25932 TID-13ikrg WARN: Terminating 1 busy worker threads
    – Mika Sep 20 '14 at 21:30
  • You have to use `sidekiq -t 120` to set Sidekiq's internal shutdown timeout. sidekiqctl's timeout is how long it will wait before killing sidekiq. – Mike Perham Sep 21 '14 at 19:01
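
To make the two timeouts concrete, here is a small sketch that builds a matching pair of start and stop commands. It assumes the same `/home/staging` paths as the command quoted above, a placeholder log path, Sidekiq 3.x flags, and a 10-second margin so that sidekiqctl's timeout is slightly longer than Sidekiq's own `-t` timeout. The margin and the helper function names are assumptions, not something the thread prescribes.

```shell
#!/bin/sh
# Seconds Sidekiq itself waits for busy workers after TERM (its -t option).
SIDEKIQ_TIMEOUT=120
# Seconds sidekiqctl waits before force-killing the process.
# Assumption: a 10 s margin so the forced kill cannot preempt a graceful shutdown.
CTL_TIMEOUT=$((SIDEKIQ_TIMEOUT + 10))
PIDFILE=/home/staging/shared/pids/sidekiq.pid

# Print the commands instead of running them, so they can be reviewed first.
sidekiq_start_cmd() {
  echo "bundle exec sidekiq -d -t $SIDEKIQ_TIMEOUT -P $PIDFILE -L /home/staging/shared/log/sidekiq.log"
}

sidekiq_stop_cmd() {
  echo "bundle exec sidekiqctl stop $PIDFILE $CTL_TIMEOUT"
}
```

With only the sidekiqctl timeout raised, Sidekiq would still use its default internal timeout, which is the 8-second pause visible in the log above.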