
I came across the following weird behaviour in Ruby 1.8.6; in 1.8.7 it seems to work correctly. Does anyone know what causes this?

h = {}
key_1 = {1 => 2}
key_2 = {1 => 2}
h[key_1] = 3
p key_1 == key_2 # => true
p h.has_key?(key_2) # => expect true, get false, wtf?

I thought this was caused by the implementation of the hash method on the Hash class, since the two keys get different hash codes:

p [key_1.hash, key_2.hash] # => [537787070, 537787060] (different)

But even if I override the hash method of Hash so that both keys get the same hash code, the lookup still fails:

class Hash
  def hash
    return self.keys.hash + self.values.hash
  end
end

p [key_1.hash, key_2.hash] # => [8, 8] (same)
p h.has_key?(key_2)        # => false

codepad link to online ruby 1.8.6 interpreter results: http://codepad.org/7nCYMP4w

Max
Jamie Cook
  • That code you are overriding is not doing what you think it does. When ruby accesses and hashes things it is using the C code from ruby itself. To prove this, try raising an exception in your overriding of #hash. It is not being called. – Michael Papile Feb 22 '11 at 08:36

2 Answers

The reason is that in Ruby 1.8.6 the hash code algorithm was broken for Hash objects used as hash keys.

http://paulbarry.com/articles/2009/09/14/why-rails-3-will-require-ruby-1-8-7

Edit: Here is an example showing that Ruby does not call .hash internally for this lookup, even after the method has been overridden to raise:

 class Hash
    def hash
       raise
    end
 end

 {1=>1}.hash
 RuntimeError: 
from (irb):12:in `hash'
from (irb):17

 h = {1=>2}
 {1=>2}
 h[1]
 2

Ruby 1.8.6 is broken in this respect, and if there were a pure Ruby way to fix it (such as reopening Hash), people would do it. It was fixed in 1.8.7.
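For comparison, here is how the lookup contract looks for ordinary user-defined key objects on a current Ruby. This is a minimal sketch, not a reproduction of the 1.8.6 internals: the class names HashOnlyKey and FullKey are hypothetical, and the point is only that a key is found when its hash code matches and eql? also returns true. Overriding hash alone is not enough.

```ruby
# Hypothetical key class for illustration: content-based hash code,
# but eql? is left as the default identity check from Object.
class HashOnlyKey
  attr_reader :data

  def initialize(data)
    @data = data
  end

  def hash
    @data.hash
  end
end

# Hypothetical key class that also defines content-based eql?.
class FullKey < HashOnlyKey
  def eql?(other)
    other.is_a?(FullKey) && data == other.data
  end
end

table = {}
table[HashOnlyKey.new([1, 2])] = 3
table[FullKey.new([1, 2])] = 4

# Same hash code, but eql? compares identity, so the lookup misses.
hash_only_found = table.has_key?(HashOnlyKey.new([1, 2]))
# Same hash code and eql? agrees, so the lookup hits.
full_found = table.has_key?(FullKey.new([1, 2]))

p hash_only_found # => false
p full_found      # => true
```

If the same rule applies to Hash keys, it may explain why patching only the hash method leaves has_key? returning false.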

Michael Papile
  • so why doesn't fixing the hash method on Hash fix the problem? – Jamie Cook Feb 22 '11 at 08:33
  • Because it is not using ruby code to make its hash codes. See my above comment. This is from ruby 1.8.4 but it is what the code looks like http://www.ruby-doc.org/doxygen/1.8.4/hash_8c-source.html – Michael Papile Feb 22 '11 at 08:38
  • If you alter the hashing function in hash.c in the ruby source code, and recompile, you will see a difference but not when changing the .hash method for higher level Ruby use. – Michael Papile Feb 22 '11 at 08:40
  • Looks like it comes down to this definition [ #define do_hash(key,table) (unsigned int)(*(table)->type->hash)((key)) ] So does this mean that you can't change the internal C representation of member function pointers? – Jamie Cook Feb 22 '11 at 08:52
  • I think that must have changed then: in 1.8.7, if I change the hash method of Hash to return Time.now.hash, it exhibits the same behaviour as 1.8.6. This means that opening the Hash class and changing the hash method actually affects that internal representation. This is probably a good argument for changing to 1.8.7 :) – Jamie Cook Feb 22 '11 at 08:59
  • Even better yet switch to 1.9.2 :) I was not sold on 1.9 for a while, but now it is too compelling not to switch. – Michael Papile Feb 22 '11 at 09:23

This is fixed in 1.8.7+, but you can monkey patch 1.8.6 to do it right too, for example: https://github.com/rdp/sane/blob/master/lib/sane/hash_hashes.rb
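A rough sketch of what such a patch might look like follows. It is not the contents of the linked file, and it has not been verified on a 1.8.6 interpreter. It assumes that 1.8.6 consults both hash and eql? on the key during lookup, and that both need content-based definitions on Hash. The version guard makes it a no-op on 1.8.7 and later.

```ruby
# Assumption: only interpreters older than 1.8.7 need the patch.
if RUBY_VERSION < "1.8.7"
  class Hash
    # Content-based hash code. XOR makes the result independent of
    # iteration order, which is not guaranteed on 1.8.
    def hash
      inject(size) { |acc, (k, v)| acc ^ [k, v].hash }
    end

    # Treat two hashes with equal contents as the same key.
    def eql?(other)
      other.is_a?(Hash) && self == other
    end
  end
end

h = { { 1 => 2 } => 3 }
# True on 1.8.7+; on 1.8.6 only if the patch behaves as assumed above.
found = h.has_key?({ 1 => 2 })
p found
```

Keys stored before the patch is loaded would keep their old hash codes, so a patch like this should be loaded before any hash-keyed table is built.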

rogerdpack