
I'm kind of new to Rails, so please bear with me. I'm using Resque to process some Ruby code in the background. To get the Resque rake task started on Heroku, I have a resque.rake file with the recommended code for hooking into Heroku's magical (or strange) threading architecture:

require "resque/tasks"
require 'resque_scheduler/tasks'

task "resque:setup" => :environment do
  ENV['QUEUE'] = '*'
end


desc "Alias for resque:work (To run workers on Heroku)"
task "jobs:work" => "resque:work"

Since I need access to the Rails code, I reference :environment. If I set at least one worker dyno in the background on Heroku, Resque works great: the queue gets cleared and everything is happy. Until I try to automate stuff...

So I wanted to evolve the code and automatically fill the queue with relevant tasks every minute or so. To do that (without using cron, because Heroku's cron is not adequate for this), I declared an initializer named task_scheduler.rb that uses Rufus scheduler to run tasks:

scheduler = Rufus::Scheduler.start_new

scheduler.in '5s' do
  autoprocessor_method
end

scheduler.every '1m' do
  autoprocessor_method
end

Things appear to work great for a while... then the rake process inexplicably stops picking up jobs from the queue. The queue just gets larger and larger. Even if I have multiple worker dynos running, they all eventually stop processing the queue. I'm not sure what I am doing wrong, but I suspect that referencing the Rails environment in my rake task causes the task_scheduler.rb code to run again, which leads to duplicate scheduling. I'm wondering how to solve that problem, if someone knows, and I'm also curious whether that is the reason the rake task stops working.
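A minimal sketch of how that suspicion could be checked, assuming the initializer is allowed to decide for itself whether to start the scheduler. The `start_scheduler?` helper and the `RUN_SCHEDULER` environment variable are hypothetical names made up for this illustration; they are not part of Rails, Resque, Rufus, or Heroku.

```ruby
# Hypothetical guard for config/initializers/task_scheduler.rb.
# RUN_SCHEDULER is an assumed, self-chosen environment variable.
def start_scheduler?(env = ENV, program_name = $PROGRAM_NAME)
  # Rake processes (such as the Resque workers) also load the Rails
  # environment, so they would otherwise start a scheduler of their own.
  return false if File.basename(program_name) == "rake"

  # Only start where the variable has been switched on explicitly.
  env["RUN_SCHEDULER"] == "true"
end

if start_scheduler?
  # The Rufus::Scheduler.start_new call and the scheduler.every block
  # from the initializer would go here.
end
```

If the queue still stalls with the guard in place, duplicate scheduling from the rake task is probably not the whole story.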

Thank you

haider

2 Answers


You should not be booting the scheduler in an initializer. Instead, you should have a daemon process that runs the scheduler and fills up your queue. It would be something like this ("script/scheduler"):

#!/usr/bin/env ruby

root = File.expand_path(File.join(File.dirname(__FILE__), '..'))
Dir.chdir(root)

require 'rubygems'
gem 'daemons'
require 'daemons'

options = {
    :dir_mode   => :normal,
    :dir        => File.join(root, 'log'),
    :log_output => true,
    :backtrace  => true,
    :multiple   => false
}

Daemons.run_proc("scheduler", options) do

  Dir.chdir(root)
  require(File.join(root, 'config', 'environment'))

  scheduler = Rufus::Scheduler.start_new

  scheduler.in '5s' do
    autoprocessor_method
  end

  scheduler.every '1m' do
    autoprocessor_method
  end

end

And you can call this script as a usual daemon from your app:

script/scheduler start

This makes sure you have only one process sending work to the Resque workers, instead of one for each Mongrel that you're running.

Maurício Linhares
  • Thank you for the quick response, Mauricio. Quick follow-up: if I called this scheduler script from an initializer, would I run into the same problem of having my Resque rake task call the scheduler and spawn yet another process? If so, where would you recommend I call the scheduler script? I need it to run on startup. – haider Jul 24 '11 at 00:54
  • Can't you set this up as an external process? A usual Linux daemon? I don't really know how Heroku works. – Maurício Linhares Jul 24 '11 at 04:36
  • And you should not call this script in an initializer, as you might have the same issue you had before, since it's going to create many daemons instead of one. – Maurício Linhares Jul 24 '11 at 04:59
  • Thanks for your help, Mauricio. If I could set up an external process on Heroku, all this would have been easier. But since I can't, I'm forced to use these other means. I'm considering moving off of Heroku because of this lack of control over background processing and over the DB. Although your answer works in the non-Heroku world, I'm going to post another answer for those folks who have this problem and are running on Heroku. Thanks again! – haider Jul 24 '11 at 09:37

First of all, if you are not running on Heroku, I would not recommend this approach. I'd look at Mauricio's answer, or consider using a classic cron job or using Whenever to schedule the cron job.

But if you are stuck with the pain of running on Heroku and trying to do this, here is how I got this to work.

I kept the original resque.rake code in place, as I pasted it in the original question. In addition, I created another rake task that I attached to the jobs:work rake process, just like in the first case:

desc "Scheduler processor"
task :scheduler => :environment do
  autoprocess_method
  scheduler = Rufus::Scheduler.start_new
  scheduler.every '1m' do
    twitter_autoprocess
  end
end

desc "Alias for resque:work (To run workers on Heroku)"
task "jobs:work" => "scheduler"
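To illustrate how attaching a task this way behaves, here is a small stand-alone sketch using plain Rake. Re-declaring an existing task appends to its prerequisite list, and prerequisites run in the order they were declared. All `demo:*` task names are hypothetical stand-ins for `:environment`, `:scheduler`, `resque:work` and `jobs:work`; nothing here touches Resque or Rufus.

```ruby
require "rake"

# Defines stand-in tasks, invokes the top-level one, and returns the
# order in which the task bodies ran.
def demo_invocation_order
  Rake::Task.clear # start from an empty task list (demo only)
  order = []

  Rake::Task.define_task("demo:environment") { order << :environment }
  Rake::Task.define_task("demo:scheduler" => "demo:environment") { order << :scheduler }
  Rake::Task.define_task("demo:resque_work" => "demo:environment") { order << :resque_work }

  # Two declarations of the same task: the prerequisites accumulate.
  Rake::Task.define_task("demo:jobs_work" => "demo:scheduler")
  Rake::Task.define_task("demo:jobs_work" => "demo:resque_work")

  Rake::Task["demo:jobs_work"].invoke
  order
end

p demo_invocation_order # => [:environment, :scheduler, :resque_work]
```

Since a Resque worker does not return while it is working, the scheduler prerequisite presumably has to be declared ahead of `resque:work` for it to run at all. That is an inference from how Rake orders prerequisites, not something verified on Heroku.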

A couple of notes:

  1. This will be imperfect once you use more than one worker dyno, because the scheduler will then run in more than one place. You can solve that by saving state somewhere, but it's not as clean as I would like.
  2. I found the original reason why the process would hang. It was this code:

    scheduler.in '5s' do
      autoprocessor_method
    end
    

    I'm not sure why, but when I removed that, it never hung again.
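For the first note, "saving state somewhere" could look roughly like the sketch below. It assumes the state lives in a store that offers Redis-style `setnx` and `expire` calls (Resque already depends on Redis, so `Resque.redis` might serve, but that is an assumption). `MinuteLock` and `FakeStore` are hypothetical names invented for this example, and `FakeStore` is only an in-memory stand-in so the sketch runs on its own.

```ruby
# Hypothetical helper: only the first process to claim a given minute
# gets to enqueue work for it.
class MinuteLock
  def initialize(store, prefix = "scheduler:lock", ttl = 120)
    @store = store
    @prefix = prefix
    @ttl = ttl
  end

  # True for the first caller in a given minute, false for the rest.
  def claim(now = Time.now)
    key = "#{@prefix}:#{now.to_i / 60}"
    return false unless @store.setnx(key, "1")

    @store.expire(key, @ttl) # let old keys clean themselves up
    true
  end
end

# In-memory stand-in for Redis (assumed interface: setnx and expire).
class FakeStore
  def initialize
    @data = {}
  end

  def setnx(key, value)
    return false if @data.key?(key)

    @data[key] = value
    true
  end

  def expire(_key, _seconds)
    true
  end
end

store = FakeStore.new
dyno_a = MinuteLock.new(store)
dyno_b = MinuteLock.new(store)
now = Time.at(1_311_500_000)

puts dyno_a.claim(now)      # true
puts dyno_b.claim(now)      # false
puts dyno_a.claim(now + 60) # true
```

Inside the `scheduler.every '1m'` block, each dyno would then enqueue only when `claim` returns true. Clock differences between dynos could still cause an occasional double run, so this is a mitigation rather than a guarantee.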

haider
  • The part about it never hanging again turned out not to be true. Hmm. I'm beginning to wonder if this is a problem in my code or in Resque. Does anyone have thoughts? The Resque process works fine several times, until it finally stops responding to queued tasks. Weird. – haider Jul 25 '11 at 03:14