47

Ruby, since v1.9, supports a deterministic order when looping through a hash; entries added first will be returned first.

Does this apply to literals, i.e. will { a: 1, b: 2 } always yield a before b?

I did a quick experiment with Ruby 2.1 (MRI) and it was in fact consistent, but to what extent is this guaranteed by the language to work on all Ruby implementations?

double-beep
mahemoff
  • 2
    Readers, including me, are asking themselves: "what does the order of a hash's keys have to do with the types of objects they are?". – Cary Swoveland Jul 14 '15 at 22:59
  • 2
    `h = { a: 1, b: 2 }` is the same as `h = { }; h[:a] = 1; h[:b] = 2` so yes. Finding a specification that says that is another story. – mu is too short Jul 14 '15 at 23:02
  • @muistooshort That's a presumption you've made without citing any evidence. – mahemoff Jul 14 '15 at 23:30
  • Go find a Ruby specification and I'll point out the relevant section. – mu is too short Jul 14 '15 at 23:44
  • Sorry I meant the first part was a presumption about any given implementation ("h = { a: 1, b: 2 } is the same as h = { }; h[:a] = 1; h[:b] = 2"). I agree there's no formal spec afaik, closest thing is probably the MRI tests and any statements from project leadership. – mahemoff Jul 14 '15 at 23:46
  • I say that those two versions are equivalent because nothing else would make any sense. There are even things that depend on that being true. End of the day, "Ruby" really means "whatever MRI does". – mu is too short Jul 14 '15 at 23:55
  • 1
    @muistooshort: You can easily test your assumption by monkey-patching `Hash` and replacing its `[]=` method with one that logs its execution, and you will see that the two forms are most definitely *not* equivalent. – Jörg W Mittag Jul 15 '15 at 07:45

2 Answers

49

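There are a couple of places where this could be specified, i.e. a couple of things that are considered "The Ruby Language Specification":

  • the ISO Ruby Language Specification
  • the RubySpec project
  • the YARV testsuite
  • the Flanagan/matz book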

The ISO spec doesn't say anything about Hash ordering: it was written in such a way that all existing Ruby implementations are automatically compliant with it, without having to change, i.e. it was written to be descriptive of current Ruby implementations, not prescriptive. At the time the spec was written, those implementations included MRI, YARV, Rubinius, JRuby, IronRuby, MagLev, MacRuby, XRuby, Ruby.NET, Cardinal, tinyrb, RubyGoLightly, SmallRuby, BlueRuby, and others. Of particular interest are MRI (which only implements 1.8) and YARV (which, at the time, only implemented 1.9). This means that the spec can only specify behavior that is common to 1.8 and 1.9, and Hash ordering is not.

The RubySpec project was abandoned by its developers out of frustration that the ruby-core developers and YARV developers never recognized it. It does, however, (implicitly) specify that Hash literals are ordered left-to-right:

new_hash(1 => 2, 4 => 8, 2 => 4).keys.should == [1, 4, 2]

That's the spec for Hash#keys. However, the other specs test that Hash#values has the same order as Hash#keys, that Hash#each_value and Hash#each_key have the same order as those, and that Hash#each_pair and Hash#each have the same order as well.
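
To make that chain of specs concrete, here is a minimal sketch in plain Ruby, without any spec framework. The helper name `iteration_orders` is my own and is not part of RubySpec. It only records what a given implementation does with a literal; it is not a statement of what the language guarantees.

```ruby
# Hypothetical helper (not part of RubySpec): collect the order in which
# each iteration method visits the entries of a hash.
def iteration_orders(hash)
  each_key_order = []
  hash.each_key { |k| each_key_order << k }

  each_value_order = []
  hash.each_value { |v| each_value_order << v }

  each_pair_order = []
  hash.each_pair { |k, v| each_pair_order << [k, v] }

  {
    keys: hash.keys,
    values: hash.values,
    each_key: each_key_order,
    each_value: each_value_order,
    each_pair: each_pair_order,
    each: hash.each.to_a
  }
end

# Same literal as in the RubySpec example, with keys deliberately unsorted.
orders = iteration_orders({ 1 => 2, 4 => 8, 2 => 4 })
orders.each { |name, order| puts "#{name}: #{order.inspect}" }
```

On an implementation that follows the 1.9 rule, I would expect every key-based list to read 1, 4, 2, matching the left-to-right order of the literal.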

I couldn't find anything in the YARV testsuite that specifies that ordering is preserved. In fact, I couldn't find anything at all about ordering in that testsuite, quite the opposite: the tests go to great lengths to avoid depending on ordering!

The Flanagan/matz book kinda-sorta implicitly specifies Hash literal ordering in section 9.5.3.6 Hash iterators. First, it uses much the same formulation as the docs:

In Ruby 1.9, however, hash elements are iterated in their insertion order, […]

But then it goes on:

[…], and that is the order shown in the following examples:

And in those examples, it actually uses a literal:

h = { :a=>1, :b=>2, :c=>3 }

# The each() iterator iterates [key,value] pairs
h.each {|pair| print pair }    # Prints "[:a, 1][:b, 2][:c, 3]"

# It also works with two block arguments
h.each do |key, value|                
  print "#{key}:#{value} "     # Prints "a:1 b:2 c:3" 
end

# Iterate over keys or values or both
h.each_key {|k| print k }      # Prints "abc"
h.each_value {|v| print v }    # Prints "123"
h.each_pair {|k,v| print k,v } # Prints "a1b2c3". Like each

In his comment, @mu is too short mentioned that

h = { a: 1, b: 2 } is the same as h = { }; h[:a] = 1; h[:b] = 2

and in another comment that

nothing else would make any sense

Unfortunately, that is not true:

module HashASETWithLogging
  def []=(key, value)
    puts "[]= was called with [#{key.inspect}] = #{value.inspect}"
    super
  end
end

class Hash
  prepend HashASETWithLogging
end

h = { a: 1, b: 2 }
# prints nothing

h = { }; h[:a] = 1; h[:b] = 2
# []= was called with [:a] = 1
# []= was called with [:b] = 2

So, depending on how you interpret that line from the book and depending on how "specification-ish" you judge that book, yes, ordering of literals is guaranteed.
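
Whatever weight one gives the book, the logging experiment only shows that a literal does not go through `Hash#[]=`; it says nothing about order by itself. The following sketch compares the two construction styles directly. The helper name `same_order?` is my own, and the result describes only the implementation it runs on.

```ruby
# Hypothetical helper: true if both hashes hold the same pairs in the same
# iteration order. Hash#== ignores order, so compare the pair arrays instead.
def same_order?(a, b)
  a.to_a == b.to_a
end

literal = { b: 2, a: 1, c: 3 }

# The same pairs, added one at a time in the same written order.
incremental = {}
incremental[:b] = 2
incremental[:a] = 1
incremental[:c] = 3

# The same pairs, written in a different order.
reordered = { a: 1, b: 2, c: 3 }

puts same_order?(literal, incremental)
puts same_order?(literal, reordered)
puts literal == reordered
```

On MRI 1.9 or later I would expect the first comparison to be true and the second false, while the plain `==` on the last line is true because hash equality does not consider order.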

Community
Jörg W Mittag
  • 1
    Thanks for this detailed answer. – mahemoff Jul 15 '15 at 09:40
  • 1
    They're functionally equivalent if you insist on picking nits. Hash literals are most likely handled in C in MRI to avoid the overhead of having to call `[]=` over and over again; they're equivalent as far as ordering goes and that's all that matters here. No other handling of hash literals makes any sense given that hashes are ordered. – mu is too short Jul 15 '15 at 17:36
22

From the documentation:

Hashes enumerate their values in the order that the corresponding keys were inserted.
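
As a small illustration of what "the order that the corresponding keys were inserted" means in practice, here is a sketch that starts from a literal and then modifies it. The method name `key_order_demo` is hypothetical. It shows observed behaviour on an implementation that follows the 1.9 rule, not a formal guarantee.

```ruby
# Hypothetical demonstration of insertion order on a hash built from a literal.
def key_order_demo
  h = { a: 1, b: 2, c: 3 }
  initial = h.keys

  h[:a] = 10              # assigning to an existing key
  after_update = h.keys

  h.delete(:a)            # removing the key and adding it again
  h[:a] = 100
  after_reinsert = h.keys

  [initial, after_update, after_reinsert]
end

initial, after_update, after_reinsert = key_order_demo
puts "initial:        #{initial.inspect}"
puts "after update:   #{after_update.inspect}"
puts "after reinsert: #{after_reinsert.inspect}"
```

Assigning to an existing key keeps its position, whereas deleting the key and inserting it again moves it to the end.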

frostmatthew
  • 3
    Yes, as the question mentions, but I'm referring to *literal* notation. – mahemoff Jul 14 '15 at 22:47
  • 3
    There is nothing different/special about literal notation. It's adding them (in the order provided) to the hash (thus will be enumerated in that same order), see https://github.com/ruby/ruby/blob/trunk/hash.c#L550-L633 – frostmatthew Jul 14 '15 at 23:02
  • 3
    That's one Ruby implementation. I'm not aware there are any tests for this. Upvoted for the code ref anyway. – mahemoff Jul 14 '15 at 23:26
  • 3
    Why would _literal_ notation make a difference? It's not like a literal notation creates a different kind of hash, it's still a hash. – the Tin Man Jul 14 '15 at 23:35
  • 10
    The Ruby parser has to parse the literal. While the most obvious way to do that is go from top-to-bottom, there could be any number of optimisation-related reasons why it could do so in a different order. Also, the Ruby 1.9 order rule applies to Ruby code; it doesn't necessarily apply to what Ruby's internals may do with a hash (ie even if Ruby inserts from top to bottom, it doesn't mean the order is preserved). – mahemoff Jul 14 '15 at 23:50
  • 1
    The particular implementation link is old and dead. Here is [the newer one](https://github.com/ruby/ruby/blob/master/hash.c#L1822). Search for *rb_hash_s_create* if it goes away. – mlt Nov 14 '20 at 05:00