
I created a big array a, which grew the process's memory usage to ~500 MB:

a = []

# background thread that prints the array's size once per second
t = Thread.new do
  loop do
    sleep 1
    print "#{a.size} "
  end
end

5_000_000.times do
  a << [rand(36**10).to_s(36)]
end

puts "\n size is #{a.size}"
a = []   # "clear" the array by dropping the only reference to it

t.join

After that, I "cleared" a, but the allocated memory didn't change until I killed the process. Is there something special I need to do to remove all the data that was assigned to a from memory?
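For reference, here is a minimal sketch of one way to watch the process's resident set size from inside Ruby (it assumes a Linux-style /proc filesystem, so treat it as illustrative rather than portable):

# Report the resident set size (RSS) of the current process by
# parsing /proc (Linux-specific; the value is in kilobytes).
def rss_kb
  File.read("/proc/self/status")[/^VmRSS:\s+(\d+)/, 1].to_i
end

puts "RSS: #{rss_kb} kB"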

– evfwcqcg

    Do you have a real use case where this is a problem? How are you measuring memory usage exactly? Is it commit, or is it memory that is unused but waiting to be reclaimed by the operating system when needed? – Ed S. Aug 11 '12 at 07:17

2 Answers


You can call GC.start, but you might not want to. See, for example, the question Ruby garbage collect for a discussion here on Stack Overflow. Basically, I'd let the garbage collector decide for itself when to run unless you have a compelling reason to force it.
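For example, here is a minimal sketch of forcing a collection and inspecting the collector's counters afterwards (GC.stat's key names are MRI implementation details and vary between Ruby versions, so take the output as illustrative):

# Build a large array, drop the reference, then force a collection.
a = Array.new(1_000_000) { rand(36**10).to_s(36) }
a = nil        # nothing references the big array any more

GC.start       # explicitly request a garbage collection run

p GC.stat      # collector counters; exact keys vary by Ruby version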

– Darshan Rivka Whittle

If I use Ruby's garbage collection profiler, GC::Profiler, on a lightly modified version of your code:

GC::Profiler.enable
GC::Profiler.clear

a = []
5_000_000.times do
  a << [rand(36**10).to_s(36)]
end

puts "\n size is #{a.size}"
a = []   # drop the only reference to the big array

GC::Profiler.report

I get the following output on Ruby 1.9.3 (some columns and rows removed):

GC 60 invokes.
Index    Invoke Time(sec)       Use Size(byte)     Total Size(byte)     ...
    1               0.109               131136               409200     ...
    2               0.125               192528               409200     ...
  ...
   58              33.484            199150344            260938656     ...
   59              36.000            211394640            260955024     ...

The profile starts with 131,136 bytes used and ends with 211,394,640 bytes used, and the used size never decreases anywhere in the run, so we can assume that no garbage collection has reclaimed the array's memory.
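If you want to cross-check the profiler's numbers, ObjectSpace.count_objects gives rough per-type live-object counts. A sketch (the :T_ARRAY and :T_STRING keys are MRI implementation details, and the counts are approximate because collections may run in between):

before = ObjectSpace.count_objects   # snapshot of per-type object counts

a = Array.new(1_000_000) { [rand(36**10).to_s(36)] }

after = ObjectSpace.count_objects
puts "arrays:  #{after[:T_ARRAY]  - before[:T_ARRAY]}"
puts "strings: #{after[:T_STRING] - before[:T_STRING]}"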

If I then add one line of code that pushes a single element onto a, placed after a has grown to 5 million elements and then been reassigned to an empty array:

GC::Profiler.enable
GC::Profiler.clear

a = []
5_000_000.times do
  a << [rand(36**10).to_s(36)]
end

puts "\n size is #{a.size}"
a = []

# the only change is to add one element to the (now) empty array a
a << [rand(36**10).to_s(36)]

GC::Profiler.report

This changes the profiler output to (some columns and rows removed):

GC 62 invokes.
Index    Invoke Time(sec)       Use Size(byte)     Total Size(byte)     ...
    1               0.156               131376               409200     ...
    2               0.172               192792               409200     ...
  ...
   59              35.375            211187736            260955024     ...
   60              36.625            211395000            469679760     ...
   61              41.891              2280168            307832976     ...

This profiler run now starts with 131,376 bytes used, similar to the previous run, and grows as before, but it ends with 2,280,168 bytes used, significantly lower than the 211,394,640 bytes the previous run ended with. We can assume that garbage collection took place during this run, probably triggered by our new line of code that adds an element to a.

The short answer is no, you don't need to do anything special to remove the data that was assigned to a, but hopefully this gives you the tools to prove it.
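If you would rather reclaim the memory immediately, instead of waiting for a later allocation to trigger a collection, an explicit GC.start after clearing the array should produce the same drop in the profiler's reported use size. A sketch of that variation (not part of the original test, and using a smaller array to keep it quick):

GC::Profiler.enable
GC::Profiler.clear

a = []
1_000_000.times { a << [rand(36**10).to_s(36)] }

a = []      # drop the reference as before
GC.start    # force a collection instead of waiting for one

GC::Profiler.report   # the final use size should drop, as in the run above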

– tsundoku