I have a problem running resque workers from god.

Here is my god config:

num_workers = 9 
queue       = '*'
current_path = "/u/apps/narg/current"

God.pid_file_directory = "/u/apps/narg/current/tmp/pids"
num_workers.times do |num|
  God.watch do |w|
    w.name     = "resque-#{num}"
    w.group    = "resque_all"
    w.interval = 30.seconds
    w.env      = { "QUEUE" => queue, "RAILS_ENV" => "production",
                   "PIDFILE" => "#{current_path}/tmp/pids/#{w.name}.pid" }
    w.start    = "cd #{current_path} ; bundle exec rake environment resque:work"
    w.log      = "#{current_path}/log/god-#{w.name}.log"
    w.pid_file = "#{current_path}/tmp/pids/#{w.name}.pid"
    w.uid      = 'root'
    w.gid      = 'root'
    w.behavior(:clean_pid_file)

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 150.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end
  end
end

When I start god while the workers are already running, everything looks fine and every watch transitions to :up.

But when the workers are not running, god stalls right after issuing the start command. The workers do start and the pid files are correct, but god never notices:

 ** [out :: narg-wrk02] I [2012-08-23 11:40:48]  INFO: resque-7 move 'unmonitored' to 'init'
 ** [out :: narg-wrk02] I [2012-08-23 11:40:48]  INFO: resque-7 moved 'unmonitored' to 'init'
 ** [out :: narg-wrk02] I [2012-08-23 11:40:48]  INFO: resque-7 [trigger] process is not running (ProcessRunning)
 ** [out :: narg-wrk02] I [2012-08-23 11:40:48]  INFO: resque-7 move 'init' to 'start'
 ** [out :: narg-wrk02] I [2012-08-23 11:40:48]  INFO: resque-7 before_start: deleted pid 
 ** [out :: narg-wrk02] I [2012-08-23 11:40:48]  INFO: resque-7 start: cd /u/apps/narg/current ; bundle exec rake environment resque:work

As said before, the workers are started and running fine, and the pid files contain the correct pids.

If I now kill god and restart it, it recognizes the running workers and transitions them to :up.

Any ideas or pointers?

Remi Guan
Robin Kara

2 Answers

Try changing

w.start = "cd #{current_path} ; bundle exec rake environment resque:work"

to

w.start = "cd #{current_path} ; bundle exec rake environment resque:work &"

This solved the same issue for me.
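
A possible explanation, offered as an assumption rather than something confirmed in this thread: because w.pid_file is set, god may treat the worker as self-daemonizing and wait for the start command to return, which a foreground rake task never does. The Ruby sketch below only illustrates that blocking difference, using sleep as a stand-in for the worker. seconds_blocked is a hypothetical helper, not part of god or resque.

```ruby
# Sketch: measure how long a shell command keeps its caller waiting.
# `seconds_blocked` is a hypothetical helper, not part of god or resque.
def seconds_blocked(command)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  system(command) # returns only once the shell running `command` exits
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# `sleep 1` stands in for a long-running worker process.
foreground = seconds_blocked("sleep 1")                 # shell waits for sleep
background = seconds_blocked("sleep 1 >/dev/null 2>&1 &") # shell backgrounds it and exits

puts format("foreground: %.2fs, background: %.2fs", foreground, background)
```

If the same thing happens inside god, the trailing & would let the start command return right away, so god can move on to checking the pid file.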

Ben Jackson

I fixed the same issue with:

w.start = "cd #{current_path} && rake environment RAILS_ENV=production resque:work &"
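
Besides the trailing &, this variant uses && instead of ; after the cd. A small Ruby sketch of the difference, assuming a POSIX shell and a made-up directory that does not exist: with ; the second command runs even when cd fails, while && skips it.

```ruby
# Sketch: `;` runs the next command regardless, `&&` only after success.
# The directory below is a made-up path that is assumed not to exist.
missing = "/nonexistent-dir-for-demo"

with_semicolon = `cd #{missing} 2>/dev/null ; echo ran`.strip
with_and       = `cd #{missing} 2>/dev/null && echo ran`.strip

puts "with ';'  -> #{with_semicolon.inspect}"
puts "with '&&' -> #{with_and.inspect}"
```

For a start command, && means rake is never launched from the wrong directory if current_path is missing.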
aldrien.h