I have a rake task that I need to run on Heroku in the background as a one-off task. However, the task is pretty large and I run into `Error R14 (Memory Quota Exceeded)`. I was hoping I could get some tips on how to avoid this.

Essentially, the task looks at the Products table and finds products that have no images (`Product.where(images: nil)`). The task then cycles through each entry: using `product.url`, it opens a connection to a remote website (using Nokogiri) and pulls the images and some additional data. The images are resized using mini_magick and saved to an S3 bucket using carrierwave.

I have about 39000 records that need processing, but after about 500 I get the Memory Quota Exceeded error and the task stops.

I can see why this is quite a memory intensive task, but I was wondering if anyone could point me in the right direction as to how I could clean up the memory after each record has been processed and saved (or even after every 100 records).

Alternatively/additionally, is there a way to automatically restart the Heroku task after it gets terminated?

8bithero

1 Answer

If you iterate over each record, you could force the GC to run every so often:

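Product.where(images: nil).each_with_index do |product, index|
  # ... fetch, resize, and save the images for this product ...

  # Force a full garbage collection every 100 records.
  GC.start if (index % 100).zero?
end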
Graham Slick
  • Yeah! That looks like a step in the right direction. Thanks! So basically it would kick off the garbage collection every 100 records? Could this also be done using `in_batches` instead of `each_with_index`? Maybe a bit out of scope, but do you have any tips on finding the optimal number of records before doing a GC? (i.e. articles, gems, etc.) – 8bithero Sep 30 '17 at 14:34
  • Never tried with `in_batches`, but I don't see why it shouldn't work :-) To find the optimal number of records, I'd print `GC.stat` for different batch sizes and compare the output. Found this article explaining `GC.stat`: https://www.speedshop.co/2017/03/09/a-guide-to-gc-stat.html – Graham Slick Sep 30 '17 at 14:38
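
To make the batch-size comparison from the comments concrete, here is a rough sketch in plain Ruby. It is not the code from the answer: `process_in_batches` is a hypothetical helper, `each_slice` stands in for whatever batching the app would use (in Rails that would presumably be `find_each` or `in_batches`), and the string allocation is only a placeholder for the real scraping and resizing work. It forces a GC after every batch and records a couple of `GC.stat` counters so that different batch sizes can be compared.

```ruby
# Hypothetical helper (not from the answer): process `items` in slices of
# `batch_size`, force a full GC after each slice, and sample GC.stat.
def process_in_batches(items, batch_size:)
  samples = []
  items.each_slice(batch_size) do |batch|
    batch.each { |item| yield item }

    GC.start # one full collection per batch
    stat = GC.stat
    samples << {
      batch_size: batch.size,
      gc_runs:    stat[:count],          # total GC runs so far in this process
      live_slots: stat[:heap_live_slots] # object slots still in use after the GC
    }
  end
  samples
end

# Placeholder workload: allocate throwaway strings for each "record".
[50, 100, 500].each do |size|
  samples = process_in_batches(1..1_000, batch_size: size) do |i|
    "record #{i}" * 100
  end
  last = samples.last
  puts "batch_size=#{size} batches=#{samples.size} " \
       "gc_runs=#{last[:gc_runs]} live_slots=#{last[:live_slots]}"
end
```

The numbers this prints depend on the Ruby version and the workload, so they are only useful relative to each other. Also note that `GC.stat` describes the Ruby heap only; memory held by native libraries or external processes would not show up there, so it is worth checking the dyno's actual memory usage as well.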