
I don't want Active Job to drop jobs when they fail. I want a chance to fix the failure and then let them re-run. I tried doing this:

class ApplicationJob < ActiveJob::Base
  retry_on Exception, attempts: Float::INFINITY
end

but it didn't work. An email job failed and was just discarded. I'm using delayed_job as the implementation.

Any ideas how to do it?

Pablo Fernandez

4 Answers


If you are using Delayed::Job, you end up with two retry mechanisms stacked on top of each other: Active Job's, which is the general Rails one, and Delayed::Job's own.

For Active Job you can do:

class ApplicationJob < ActiveJob::Base
  retry_on Exception, wait: :exponentially_longer, attempts: Float::INFINITY
end

Without wait: :exponentially_longer, retries use the default wait of 3 seconds, so you may end up with a lot of jobs retrying every 3 seconds.
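To give a feel for how quickly the waits grow, here is a minimal plain-Ruby sketch of the schedule. It assumes the formula given in the Active Job documentation for :exponentially_longer, roughly executions**4 + 2 seconds, and leaves out the random jitter that Active Job adds on top. The helper name approximate_backoff is made up for this illustration and is not part of Rails.

```ruby
# Approximate delay, in seconds, before retry number `executions`.
# Assumption: follows the documented (executions**4) + 2 formula and
# ignores the random jitter Active Job adds.
def approximate_backoff(executions)
  (executions**4) + 2
end

delays = (1..5).map { |n| approximate_backoff(n) }
puts delays.inspect # => [3, 18, 83, 258, 627]
```

So the first retry still comes after about 3 seconds, but the fifth is already more than 10 minutes out.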

The behavior of this retry method can look a bit odd if you are using Delayed::Job. From Delayed::Job's point of view the job ran and succeeded, but because it actually failed, Active Job rescued the exception and enqueued a new job to run at a later time. Because of that, the attempts field in Delayed::Job stays at 0, and you need to look at the handler field to see how many times the job was run.

Once Active Job fails for the last time, the exception bubbles up to Delayed::Job, which has its own retry mechanism. By default, Delayed::Job retries 25 times and then deletes the job.
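The layering can be illustrated with a toy model in plain Ruby. This is not the real Active Job or Delayed::Job code; with_inner_retries is a hypothetical stand-in for retry_on, and the surrounding begin/rescue stands in for the queue backend. It only shows that the outer layer never sees a failure until the inner layer gives up.

```ruby
# Inner layer (stand-in for retry_on): rescue failures and run the work
# again until `attempts` executions have happened, then re-raise.
def with_inner_retries(attempts:)
  executions = 0
  begin
    executions += 1
    yield executions
  rescue StandardError
    retry if executions < attempts
    raise
  end
end

# Outer layer (stand-in for the queue backend): it only counts a failure
# when an exception actually reaches it.
outer_failures = 0
inner_executions = 0

begin
  with_inner_retries(attempts: 3) do |n|
    inner_executions = n
    raise "still broken"
  end
rescue StandardError
  outer_failures += 1
end

puts "inner executions: #{inner_executions}, failures seen by outer layer: #{outer_failures}"
```

The work ran three times, yet the outer layer recorded a single failure, which mirrors why the attempts column stays low while Active Job is doing the retrying.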

To make Delayed::Job keep trying forever, you can create an initializer, config/initializers/delayed_job_config.rb, that changes the max_attempts value:

Delayed::Worker.max_attempts = Float::INFINITY

If you are worried about losing jobs, you can instead let them fail and keep the failed records in the database by setting:

Delayed::Worker.destroy_failed_jobs = false

Which of the two you use, or how you mix them, is up to you. Relying on Delayed::Job's retries makes the database records make a bit more sense; relying on Active Job's means the approach carries over to other queue backends.

Pablo Fernandez
eux

retry_on exception, attempts: :unlimited (Rails 7.0+)

Starting with Rails 7.0, Active Job supports passing attempts: :unlimited to the retry_on method. From the documentation:

:attempts - Re-enqueues the job the specified number of times (default: 5 attempts) or a symbol reference of :unlimited to retry the job until it succeeds

For example:

class RemoteServiceJob < ActiveJob::Base
  # ...
  retry_on CustomInfrastructureException, wait: 5.minutes, attempts: :unlimited

  def perform(*args)
    # ...
  end
end
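As a rough mental model of what :unlimited changes, the plain-Ruby sketch below mimics the kind of check retry_on makes before re-enqueueing. It is a simplified stand-in, not the Rails source; the method name retry_again? is invented for this illustration.

```ruby
# Simplified stand-in for the retry decision (assumption: not the actual
# Rails implementation). `executions` is how many times the job has run.
def retry_again?(executions, attempts)
  attempts == :unlimited || executions < attempts
end

checks = {
  default_exhausted: retry_again?(5, 5),                  # limit reached
  default_remaining: retry_again?(4, 5),                  # one attempt left
  unlimited:         retry_again?(1_000_000, :unlimited)  # never gives up
}
puts checks.inspect
```

With a numeric limit the job eventually stops being retried and the exception bubbles up; with :unlimited that branch is never reached.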


Marian13

We can implement our own retry logic by passing a block to retry_on, as described in the retry_on documentation:

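If the exception keeps getting raised beyond the specified number of attempts, the exception is allowed to bubble up to the underlying queuing system, which may have its own retry mechanism or place it in a holding queue for inspection.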

You can also pass a block that'll be invoked if the retry attempts fail for custom logic rather than letting the exception bubble up. This block is yielded with the job instance as the first and the error instance as the second parameter.

retry_on Exception do |job, error|
  # Built-in retries are exhausted: enqueue a fresh copy with the original arguments
  job.class.perform_later(*job.arguments)
end

Working example of infinite retry:

# test_job.rb

require 'active_record'
require 'active_support'
require 'active_job'
require 'globalid'

ActiveJob::Base.queue_adapter = :async
GlobalID.app = 'app'
logger = ActiveJob::Base.logger

class ProcessPhotoJob < ActiveJob::Base
  retry_on ActiveRecord::RecordNotFound do |job, error|
    # Runs once the default retries are exhausted; start a new cycle
    logger.info " retrying job #{job.job_id}"
    ProcessPhotoJob.perform_later(*job.arguments)
  end

  def perform
    logger.info ' performing, but getting error:'
    raise ActiveRecord::RecordNotFound
  end
end

ProcessPhotoJob.perform_later

while true
  sleep 1
end

Which can be run with:

ruby test_job.rb
itsnikolay

This should work

retry_on Exception, wait: 5.minutes, attempts: :unlimited

https://edgeapi.rubyonrails.org/classes/ActiveJob/Exceptions/ClassMethods.html

Tmar