I have about six Sidekiq workers that crawl JSON. Depending on the size of the endpoint's dataset, they finish in anywhere between 1 minute and 4 hours. Watching the longest one in particular, which takes 4 hours, I see a very slight increase in memory over time.

This isn't a problem until I schedule the same worker jobs again. The memory is not deallocated and keeps piling up until the Linux OOM killer terminates my Sidekiq process.

Memory leak? I watched the number of different objects in ObjectSpace:

ObjectSpace.each_object.each_with_object(Hash.new(0)) { |o, count| count[o.class] += 1 }

There is no real increase there: the number of hashes, arrays, etc. stays the same, short-lived increases are swept away by the garbage collector, and GC.stat[:count] tells me that the garbage collector is running, too.
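To compare the Ruby-level view with what the operating system sees, the check described above can be put next to the process's resident set size. This is only a sketch: the helper names `object_counts`, `rss_kb`, and `memory_snapshot` are made up for illustration, and reading `/proc/self/status` assumes Linux (with a `ps` fallback elsewhere).

```ruby
# Count live objects per class. each_with_object keeps the Hash as the
# accumulator regardless of what the block returns.
def object_counts
  ObjectSpace.each_object(Object).each_with_object(Hash.new(0)) do |obj, counts|
    counts[obj.class] += 1
  end
end

# Resident set size of the current process in kilobytes.
# Assumption: Linux exposes VmRSS in /proc/self/status; otherwise ask ps.
def rss_kb
  status = "/proc/self/status"
  if File.readable?(status)
    File.read(status)[/^VmRSS:\s+(\d+)/, 1].to_i
  else
    `ps -o rss= -p #{Process.pid}`.to_i
  end
end

# One combined sample: OS-level memory, GC activity, live Ruby objects.
def memory_snapshot
  {
    rss_kb:  rss_kb,
    gc_runs: GC.stat[:count],
    objects: object_counts.values.inject(0, :+)
  }
end

before = memory_snapshot
# ... run one crawl job here ...
after = memory_snapshot

after.each do |key, value|
  puts "#{key}: #{before[key]} -> #{value}"
end
```

If `rss_kb` keeps climbing across jobs while `objects` stays flat, the growth is probably happening outside the Ruby object heap.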

Even after the worker finishes, i.e. [Done] is logged and no workers are busy any more, the memory is not deallocated. What are the reasons for that? Is there anything I can do about it? Write a finalizer?

The only solution I have so far: restart the Sidekiq process.

I am on Ruby 2.0.0 (MRI).


For the JSON parsing I use Yajl, which is a C binding. I need it because it seems to be the only fast JSON parser that properly implements streamed reading and writing.

Guarana Joe
  • 1
    What gem are you using to parse the input JSON? Are you using any other gems with C-extensions? It sounds from your description (memory usage grows, but count of Ruby objects is constant) like you might have a leak from a gem with a C-extension (e.g. some gem is allocating memory that is not used to store a Ruby object, and never freeing it). – grumbler Sep 28 '13 at 21:07
  • 4
    It's also possible that you have a pure Ruby 'leak' where you're mutating an object repeatedly and causing it to grow in size without allocating new objects. For example, appending repeatedly to a Ruby String would cause it to continually consume more memory, without bumping your object count. – grumbler Sep 28 '13 at 21:21
  • 1
    @grumbler Oh good point. I extended my question. I use Yajl for the JSON parsing, which is indeed a C binding. Never thought about this. – Guarana Joe Sep 29 '13 at 12:00
  • What type of memory are you measuring? Active memory, total memory, ..? I have a sidekiq that takes a long time to decrease the total memory usage, but eventually it does, but you probably already tried waiting... – Jan Segre Oct 04 '13 at 04:01
  • @JanSegre Yes, well. Just the object space so far. I also see that `GC` has a nice `Profiler` submodule; I should make use of that, too. I should probably also get Valgrind running to investigate this; it seems to be my best bet if the leak really comes from the C extension. – Guarana Joe Oct 04 '13 at 12:04
  • 2
    If I have understood correctly, Ruby will never free memory it has once reserved, not even during garbage collection. The memory "freed" by garbage collection is actually just marked as safe to reuse by that same Ruby process. Also, if your JSON parser is converting the JSON into a hash with symbolized keys, none of the symbols will ever be garbage collected; symbols are forever. – Kimmo Lehto Nov 17 '13 at 17:57
  • I had the same issue and I'm just restarting the sidekiq process. – jspooner Dec 04 '13 at 21:10
  • This is actually a known problem it seems, even mentioning MRI. See the link in my answer, on Mike Perham's blog. – digitalextremist Dec 06 '13 at 12:51
  • 1
    Since Ruby 2.2 symbols are Garbage Collected: https://www.ruby-lang.org/en/news/2014/12/25/ruby-2-2-0-released/ Also Ruby actually frees memory back to OS: https://github.com/ruby/ruby/blob/d94ea30a576136d93b122b6eb69971b18be927d5/gc.c#L1459-L1462 – Psylone Oct 31 '16 at 10:49
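
grumbler's comment above, about a single object growing without the object count changing, can be illustrated with a small sketch. The `buffer` and `chunk` names are hypothetical and stand in for whatever long-lived string a crawler might append to; `ObjectSpace.memsize_of` comes from the standard `objspace` library and only reports MRI's own estimate of an object's size.

```ruby
require 'objspace'

buffer = String.new   # hypothetical long-lived accumulator
chunk  = "x" * 1024   # allocated once, before measuring

strings_before = ObjectSpace.each_object(String).count
bytes_before   = ObjectSpace.memsize_of(buffer)

# Appending mutates the same String in place; no new String is created
# per iteration, but the buffer behind it keeps growing.
1_000.times { buffer << chunk }

strings_after = ObjectSpace.each_object(String).count
bytes_after   = ObjectSpace.memsize_of(buffer)

puts "String objects: #{strings_before} -> #{strings_after}"
puts "buffer size:    #{bytes_before} -> #{bytes_after} bytes"
```

A per-class object count alone would not reveal this kind of growth, which is why it is worth checking the size of long-lived objects as well.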

1 Answer

Mike Perham, who wrote Sidekiq, addressed this here: http://www.mikeperham.com/2009/05/25/memory-hungry-ruby-daemons/

tl;dr version: MRI will not give the memory back. The most you can do is control the heap, and for that Ruby Enterprise Edition was suggested.

I don't know whether any of this helps, but that is the situation, straight from the horse's mouth.

digitalextremist