I've discovered a memory leak in my Rails code - that is to say, I've found what code leaks but not why it leaks. I've reduced it down to a test case that doesn't require Rails:
require 'csspool'
require 'ruby-mass'
def report
puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1].to_s + 'KB'
Mass.print
end
report
# note I do not store the return value here
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
report
ruby-mass supposedly lets me see all the objects in memory. CSSPool is a CSS parser based on racc. /home/jason/big.css is a 1.5MB CSS file.
This outputs:
Memory 9264KB
==================================================
Objects within [] namespace
==================================================
String: 7261
RubyVM::InstructionSequence: 1151
Array: 562
Class: 313
Regexp: 181
Proc: 111
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 29
Gem::Requirement: 25
RubyVM::Env: 11
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
Memory 258860KB
==================================================
Objects within [] namespace
==================================================
String: 7456
RubyVM::InstructionSequence: 1151
Array: 564
Class: 313
Regexp: 181
Proc: 113
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 30
Gem::Requirement: 25
RubyVM::Env: 13
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
You can see the memory going way up. Some of the counters go up, but no objects specific to CSSPool are present. I used ruby-mass's "index" method to inspect the objects that have references like so:
Mass.index.each do |k,v|
v.each do |id|
refs = Mass.references(Mass[id])
puts refs if !refs.empty?
end
end
But again, this doesn't give me anything related to CSSPool, just gem info and such.
I've also tried outputting "GC.stat"...
puts GC.stat
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
puts GC.stat
Result:
{:count=>4, :heap_used=>126, :heap_length=>138, :heap_increment=>12, :heap_live_num=>50924, :heap_free_num=>24595, :heap_final_num=>0, :total_allocated_object=>86030, :total_freed_object=>35106}
{:count=>16, :heap_used=>6039, :heap_length=>12933, :heap_increment=>3841, :heap_live_num=>13369, :heap_free_num=>2443302, :heap_final_num=>0, :total_allocated_object=>3771675, :total_freed_object=>3758306}
As I understand it, if an object is not referenced and garbage collection happens, then that object should be cleared from memory. But that doesn't seem to be what's happening here.
I've also read about C-level memory leaks, and since CSSPool uses Racc which uses C code, I think this is a possibility. I've run my code through Valgrind:
valgrind --partial-loads-ok=yes --undef-value-errors=no --leak-check=full --fullpath-after= ruby leak.rb 2> valgrind.txt
Results are here. I'm not sure if this confirms a C-level leak, as I've also read that Ruby does things with memory that Valgrind doesn't understand.
Versions used:
- Ruby 2.0.0-p247 (this is what my Rails app runs)
- Ruby 1.9.3-p392-ref (for testing with ruby-mass)
- ruby-mass 0.1.3
- CSSPool 4.0.0 from here
- CentOS 6.4 and Ubuntu 13.10