Does Enumerable#group_by preserve the original order of the elements within each value? When I get this:

[1, 2, 3, 4, 5].group_by{|i| i % 2}
# => {1=>[1, 3, 5], 0=>[2, 4]}

is it guaranteed that the array [1, 3, 5] contains the elements in this order and not, for example, [3, 1, 5]?

Is there any description regarding this point?

I am not mentioning the order between the keys 1 and 0. That is a different issue.

Dave Schweisguth
sawa
  • `Enumerable` uses `each` to traverse the collection. Changing the order would require extra effort. – Stefan Jun 24 '14 at 06:40
  • But I had previously learned that the reminiscent `Enumerable#sort` is not stable, so I couldn't be sure about it. – sawa Jun 24 '14 at 06:42

2 Answers


Yes, Enumerable#group_by preserves input order.

Here's the implementation of that method in MRI, from https://github.com/ruby/ruby/blob/trunk/enum.c:

static VALUE
enum_group_by(VALUE obj)
{
    VALUE hash;

    RETURN_SIZED_ENUMERATOR(obj, 0, 0, enum_size);

    hash = rb_hash_new();
    rb_block_call(obj, id_each, 0, 0, group_by_i, hash);
    OBJ_INFECT(hash, obj);

    return hash;
}

static VALUE
group_by_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, hash))
{
    VALUE group;
    VALUE values;

    ENUM_WANT_SVALUE();

    group = rb_yield(i);
    values = rb_hash_aref(hash, group);
    if (!RB_TYPE_P(values, T_ARRAY)) {
        values = rb_ary_new3(1, i);
        rb_hash_aset(hash, group, values);
    }
    else {
        rb_ary_push(values, i);
    }
    return Qnil;
}

enum_group_by calls group_by_i on each element of obj, in the order in which obj's each yields them. group_by_i creates a one-element array (rb_ary_new3(1, i)) the first time a group is encountered, and pushes onto that array thereafter (rb_ary_push(values, i)). So the order in which the elements are yielded is preserved within every group.
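For readers who do not read C, the same logic can be sketched in plain Ruby. This is only an illustration, not MRI's code: `group_by_sketch` is a hypothetical name, and the sketch ignores details such as returning an enumerator when no block is given or packing multiple yielded values.

```ruby
# Hypothetical pure-Ruby sketch of the C logic (not MRI's actual implementation).
def group_by_sketch(enum)
  hash = {}
  # Elements arrive in whatever order enum's #each yields them.
  enum.each do |item|
    key = yield(item)
    if hash.key?(key)
      # Group already seen: append, so arrival order is kept.
      hash[key] << item
    else
      # First element of a new group: start a one-element array.
      hash[key] = [item]
    end
  end
  hash
end

result = group_by_sketch([1, 2, 3, 4, 5]) { |i| i % 2 }
p result
```

Because elements are only ever appended to a group's array, nothing in this loop can reorder them relative to the order in which each produced them.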

Also, RubySpec requires it. From https://github.com/rubyspec/rubyspec/blob/master/core/enumerable/group_by_spec.rb:

it "returns a hash with values grouped according to the block" do
  e = EnumerableSpecs::Numerous.new("foo", "bar", "baz")
  h = e.group_by { |word| word[0..0].to_sym }
  h.should == { :f => ["foo"], :b => ["bar", "baz"]}
end
Dave Schweisguth
  • This depends on the implementation of `each` which is mapped via `id_each` in the call to `rb_block_call` – Matt Jun 24 '14 at 07:41

More specifically, Enumerable calls each so it depends on how each is implemented and whether each yields the elements in the original order:

class ReverseArray < Array
  def each(&block)
    reverse_each(&block)
  end
end

array = ReverseArray.new([1,2,3,4])
#=> [1, 2, 3, 4]

array.group_by { |i| i % 2 }
#=> {0=>[4, 2], 1=>[3, 1]}
Stefan
  • Thanks for making clear that it calls `each`. `reverse_each` would not make sense if the order of `each` were arbitrary. Doesn't the existence of `reverse_each` imply that `each` preserves the order (unless it is overwritten by the user)? – sawa Jun 24 '14 at 08:19
  • It's just an example. `Array#each` of course yields the elements [in the same order](http://www.ruby-doc.org/core-2.1.2/Array.html#class-Array-label-Iterating+over+Arrays): *"In case of Array’s each, all elements in the Array instance are yielded to the supplied block in sequence."* But other classes could implement it differently, for example `Hash` in Ruby 1.8. Therefore, `group_by` can't guarantee any order, it depends entirely on `each`. – Stefan Jun 24 '14 at 08:28
  • So, in conclusion, `group_by` preserves order when dealing with arrays, but not necessarily in other structures. – Toby 1 Kenobi Dec 10 '18 at 01:00
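
To illustrate that conclusion with a non-Array collection, here is a minimal sketch. `SortedBag` is a hypothetical class made up for this example. Its `each` yields the items in sorted order rather than in the order they were passed in, and `group_by` follows `each`.

```ruby
# Hypothetical Enumerable whose #each does not follow insertion order.
class SortedBag
  include Enumerable

  def initialize(*items)
    @items = items
  end

  # Yields the items in ascending order, regardless of how they were added.
  def each(&block)
    return enum_for(:each) unless block
    @items.sort.each(&block)
  end
end

bag = SortedBag.new(5, 2, 4, 1, 3)
grouped = bag.group_by { |i| i % 2 }
p grouped
# => {1=>[1, 3, 5], 0=>[2, 4]}  (order of each, not the order 5, 2, 4, 1, 3)
```

Within each group the elements appear in the order `each` yielded them, which here differs from the order in which they were given to the constructor.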