You can use any value you want, as long as you use the same object for every key.
x = "A string value"
h = Hash[ 10000.times.map{|i| [i, x]} ]
h2 = Hash[ 10000.times.map{|i| [i, nil]} ]
# h takes the same memory as h2
In the above example, x can be anything you like. The values will only hold the pointer to x, or the value itself if x is an immediate value (nil, true, false or a Fixnum, i.e. a small Integer in Ruby 2.4 and later).
In either case, the memory used is the same! It will be the size of a pointer on your platform (i.e. 0.size bytes). In the C code, this corresponds to a VALUE.
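One rough way to check this is ObjectSpace.memsize_of from the objspace standard library. This is a minimal sketch, assuming CRuby: the figures are implementation-specific, and memsize_of counts only the hash table itself, not the objects its values point to.

```ruby
require 'objspace'

x  = "A string value"
h  = Hash[ 10000.times.map{|i| [i, x]} ]
h2 = Hash[ 10000.times.map{|i| [i, nil]} ]

# memsize_of reports the bytes used by the hash table itself,
# not by the objects its values refer to (a CRuby-specific figure).
h_bytes  = ObjectSpace.memsize_of(h)
h2_bytes = ObjectSpace.memsize_of(h2)

puts "h:  #{h_bytes} bytes"
puts "h2: #{h2_bytes} bytes"
puts "same size? #{h_bytes == h2_bytes}"
puts "0.size: #{0.size} bytes"
```

The exact byte counts vary between Ruby versions, so only the comparison between the two hashes is meaningful here.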
Just be careful to reuse the same object (i.e. same object_id) and not create a new object every time. For example:
h3 = Hash[ 10000.times.map{|i| [i, "A string value"]} ]
# => h3 will take a lot more space!
h.values.map(&:object_id).uniq.size # => 1
h3.values.map(&:object_id).uniq.size # => 10000
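To see where the extra space goes when every value is a new object, one can add up the size of the distinct value objects. This is a rough sketch, assuming CRuby and its objspace library. value_bytes is a hypothetical helper, and String.new is used so that each value is guaranteed to be a fresh object regardless of any frozen-string-literal setting.

```ruby
require 'objspace'

# Hypothetical helper: total bytes of the distinct objects used as values.
def value_bytes(hash)
  hash.values.uniq(&:object_id).sum { |v| ObjectSpace.memsize_of(v) }
end

x      = "A string value"
shared = Hash[ 10000.times.map{|i| [i, x]} ]
fresh  = Hash[ 10000.times.map{|i| [i, String.new("A string value")]} ]

puts value_bytes(shared)  # size of a single String object
puts value_bytes(fresh)   # combined size of 10000 String objects
```

The hash tables themselves are the same size in both cases; the difference comes from the separately allocated String objects.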
In short, a surefire way is to use false, true, nil, a Fixnum or a Symbol. Symbols work because they are stored in a global table: :hello.object_id is the same everywhere, and the string 'hello' is stored only once and shared by all the :hello symbols in your code.
h4 = Hash[ 10000.times.map{|i| [i, :some_symbol]} ]
# => h4 will only take as much space as h and h2
h4.values.map(&:object_id).uniq.size # => 1
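The difference between symbols and strings can be checked directly with equal?, which compares object identity. A small sketch (the variable names are only for illustration):

```ruby
a = :hello
b = "hello".to_sym
c = ("hel" + "lo").to_sym   # built at runtime, still the same symbol

puts a.equal?(b)                   # => true
puts a.object_id == c.object_id    # => true

# Two equal strings, by contrast, are separate objects.
s1 = String.new("hello")
s2 = String.new("hello")
puts s1 == s2         # => true  (same content)
puts s1.equal?(s2)    # => false (different objects)
```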
FYI, the built-in library Set has the same requirement, i.e. it uses a Hash only for the keys. It uses true as the value, for simplicity's sake.
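The idea can be sketched with a tiny wrapper around a Hash. TinySet below is a hypothetical, illustration-only class, not the actual Set source, and it covers only a handful of methods.

```ruby
# TinySet is a hypothetical class for illustration; it is not the real Set.
class TinySet
  include Enumerable

  def initialize(items = [])
    @hash = {}
    items.each { |item| add(item) }
  end

  # Store the element as a key; the value is always the immediate `true`,
  # so no extra object is allocated per entry.
  def add(item)
    @hash[item] = true
    self
  end

  def include?(item)
    @hash.key?(item)
  end

  def size
    @hash.size
  end

  def each(&block)
    return enum_for(:each) unless block_given?
    @hash.each_key(&block)
    self
  end
end

s = TinySet.new([1, 2, 2, 3])
s.add(3)
puts s.size          # => 3
puts s.include?(2)   # => true
puts s.include?(4)   # => false
```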