
I have an array of hashes like so:

 [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]

And I'm trying to map this onto a single hash like this:

{"testPARAM2"=>"testVAL2", "testPARAM1"=>"testVAL1"}

I have achieved it using

  par = {}
  mitem["params"].each { |h| h.each { |k, v| par[k] = v } }

But I was wondering if it's possible to do this in a more idiomatic way (preferably without using a local variable).

How can I do this?

Bart Platak

5 Answers


You could compose Enumerable#reduce and Hash#merge to accomplish what you want.

input = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
input.reduce({}, :merge)
# => {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}

Reducing an array is sort of like sticking a method call between each element of it.

For example [1, 2, 3].reduce(0, :+) is like saying 0 + 1 + 2 + 3 and gives 6.

In our case we do something similar, but with the merge function, which merges two hashes.

[{:a => 1}, {:b => 2}, {:c => 3}].reduce({}, :merge)
# is {}.merge({:a => 1}).merge({:b => 2}).merge({:c => 3})
# => {:a => 1, :b => 2, :c => 3}
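If the symbol form feels opaque, the same fold can be written with an explicit block; a minimal sketch, equivalent to the reduce above:

input = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
# The block receives the accumulator and the next hash; merge returns a
# new hash, which becomes the accumulator for the following step.
input.reduce({}) { |memo, h| memo.merge(h) }
# => {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}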
cjhveal
  • input.reduce(&:merge) is sufficient. – redgetan Dec 15 '14 at 15:31
  • @redgetan is that any different from `input.reduce(:merge)`? – David van Geest May 12 '15 at 14:13
  • @David van Geest: In this case they are equivalent. The unary ampersand as used here builds a block out of the symbol. However, reduce has a special case that accepts a symbol directly. I wanted to avoid the unary ampersand operator to simplify the example, but redgetan is correct that the initial value is optional in this case. – cjhveal Jul 14 '15 at 22:55
  • Note that if you use `merge!` instead of `merge` it will modify the first hash (which you may not want) but will not create an intermediary hash for each new merge. – Phrogz Jan 31 '16 at 21:41
  • @Phrogz great point. In the example I gave in the answer above, it would only mutate the empty hash passed in as the initial value, which is exactly what we'd want. I think it'd be strictly more performant with the same semantics. – cjhveal Feb 01 '16 at 02:17
  • If you care about memory (say you want to merge a huge dictionary), the solutions from @redgetan and cjhveal create intermediate copies, while `merge!` (like other bang methods) operates [in situ](https://en.wikipedia.org/wiki/In_situ#Computer_science) (consider this an extension of Phrogz's explanation). This may or may not be desired, so both variants exist for a reason. Instead of `reduce` you could also use `each_with_object({})` ([ruby >= 1.9.1.378](https://apidock.com/ruby/v1_9_1_378/Enumerable/each_with_object)) - RuboCop tells me that this is preferred. – Cadoiz Aug 09 '23 at 06:20
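To make Phrogz's and cjhveal's points concrete, here is a minimal sketch (with hypothetical one-pair hashes) of the difference in mutation behavior:

a = {x: 1}
b = {y: 2}

# No initial value: reduce uses the first element as the accumulator,
# so :merge! mutates a itself.
[a, b].reduce(:merge!)
a # => {:x=>1, :y=>2}

# Fresh initial hash: only the new {} is mutated; the source hashes survive.
c = {x: 1}
d = {y: 2}
[c, d].reduce({}, :merge!)
c # => {:x=>1}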

How about:

h = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
r = h.inject(:merge)
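# => {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}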
shigeya
  • This scheme is effectively the same as what Joshua answered, but it repeatedly applies #merge (a method name passed as a symbol) across all of the hashes (think of inject as injecting an operator between items). Refer to [#inject](http://rdoc.info/stdlib/core/Enumerable%3ainject). – shigeya Aug 08 '12 at 01:56
  • How come we don't need the ampersand, as in h.inject(&:merge)? – Donato Jun 03 '15 at 22:05
  • Because inject also accepts a symbol as a parameter and interprets it as a method name. It's a feature of inject. – shigeya Jun 26 '15 at 08:25
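For reference, the three spellings discussed in these comments all give the same result here; a quick sketch:

hashes = [{a: 1}, {b: 2}]
hashes.inject(:merge)                      # symbol: inject calls it as a method name
hashes.inject(&:merge)                     # &: turns the symbol into a block first
hashes.inject { |memo, h| memo.merge(h) }  # explicit block, same fold
# => {:a=>1, :b=>2} in all three cases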

Every answer so far advises using Enumerable#reduce (or inject, which is an alias) + Hash#merge, but beware: while clean, concise and human readable, this solution is hugely time-consuming and has a large memory footprint on large arrays.

I have compiled different solutions and benchmarked them.

Some options

a = [{'a' => {'x' => 1}}, {'b' => {'x' => 2}}]

# to_h (assumes each hash contains exactly one key/value pair)
a.to_h { |h| [h.keys.first, h.values.first] }

# each_with_object
a.each_with_object({}) { |x, h| h.store(x.keys.first, x.values.first) }
# each_with_object (nested)
a.each_with_object({}) { |x, h| x.each { |k, v| h.store(k, v) } }
# map.with_object
a.map.with_object({}) { |x, h| h.store(x.keys.first, x.values.first) }
# map.with_object (nested)
a.map.with_object({}) { |x, h| x.each { |k, v| h.store(k, v) } }

# reduce + merge
a.reduce(:merge) # extremely slow on large arrays because Hash#merge creates a new hash on each iteration
# reduce + merge!
a.reduce(:merge!) # fast, but modifies the first hash in a, which is probably not what you want
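Not benchmarked above, but as cjhveal's comment on the first answer notes, passing a fresh hash as the initial value keeps the in-place speed of merge! while leaving a untouched; a minimal sketch:

# reduce + merge! with a fresh accumulator: only the new {} is mutated
a.reduce({}, :merge!)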

Benchmark script

It's important to use bmbm rather than bm to avoid differences that are due to the cost of memory allocation and garbage collection.

require 'benchmark'

a = (1..50_000).map { |x| { "a#{x}" => { 'x' => x } } }

Benchmark.bmbm do |x|
  x.report('to_h:') { a.to_h { |h| [h.keys.first, h.values.first] } }
  x.report('each_with_object:') { a.each_with_object({}) { |e, h| h.store(e.keys.first, e.values.first) } }
  x.report('each_with_object (nested):') { a.each_with_object({}) { |e, h| e.each { |k, v| h.store(k, v) } } }
  x.report('map.with_object:') { a.map.with_object({}) { |e, h| h.store(e.keys.first, e.values.first) } }
  x.report('map.with_object (nested):') { a.map.with_object({}) { |e, h| e.each { |k, v| h.store(k, v) } } }
  x.report('reduce + merge:') { a.reduce(:merge) }
  x.report('reduce + merge!:') { a.reduce(:merge!) }
end

Note: I initially tested with a 1_000_000-item array, but because reduce + merge copies the entire accumulated hash on every iteration, its cost grows quadratically and it would have taken far too long to finish.

Benchmark results

50k items array

Rehearsal --------------------------------------------------------------
to_h:                        0.031464   0.004003   0.035467 (  0.035644)
each_with_object:            0.018782   0.003025   0.021807 (  0.021978)
each_with_object (nested):   0.018848   0.000000   0.018848 (  0.018973)
map.with_object:             0.022634   0.000000   0.022634 (  0.022777)
map.with_object (nested):    0.020958   0.000222   0.021180 (  0.021325)
reduce + merge:              9.409533   0.222870   9.632403 (  9.713789)
reduce + merge!:             0.008547   0.000000   0.008547 (  0.008627)
----------------------------------------------------- total: 9.760886sec

                                 user     system      total        real
to_h:                        0.019744   0.000000   0.019744 (  0.019851)
each_with_object:            0.018324   0.000000   0.018324 (  0.018395)
each_with_object (nested):   0.029053   0.000000   0.029053 (  0.029251)
map.with_object:             0.021635   0.000000   0.021635 (  0.021782)
map.with_object (nested):    0.028842   0.000005   0.028847 (  0.029046)
reduce + merge:             17.331742   6.387505  23.719247 ( 23.925125)
reduce + merge!:             0.008255   0.000395   0.008650 (  0.008681)

2M items array (excluding reduce + merge)

Rehearsal --------------------------------------------------------------
to_h:                        2.036005   0.062571   2.098576 (  2.116110)
each_with_object:            1.241308   0.023036   1.264344 (  1.273338)
each_with_object (nested):   1.126841   0.039636   1.166477 (  1.173382)
map.with_object:             2.208696   0.026286   2.234982 (  2.252559)
map.with_object (nested):    1.238949   0.023128   1.262077 (  1.270945)
reduce + merge!:             0.777382   0.013279   0.790661 (  0.797180)
----------------------------------------------------- total: 8.817117sec

                                 user     system      total        real
to_h:                        1.237030   0.000000   1.237030 (  1.247476)
each_with_object:            1.361288   0.016369   1.377657 (  1.388984)
each_with_object (nested):   1.765759   0.000000   1.765759 (  1.776274)
map.with_object:             1.439949   0.029580   1.469529 (  1.481832)
map.with_object (nested):    2.016688   0.019809   2.036497 (  2.051029)
reduce + merge!:             0.788528   0.000000   0.788528 (  0.794186)
Cadoiz
noraj
  • Why would you exclude the "cost of memory allocation and garbage collection"? Isn't this something you would have in real life too? Is `reduce` + `merge!` still the best then? I did some experiments with your code and found no substantial difference between `bm` and `bmbm` - but I can support your results. A different note: RuboCop tells me that I should prefer `each_with_object`, which is more than twice as slow here. But thanks for your comparison, this is what I was looking for. [This solution using Ruby 3 `merge`](https://stackoverflow.com/a/76636416/4575793) could also be considered. – Cadoiz Aug 09 '23 at 06:39

Use #inject

hashes = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
merged = hashes.inject({}) { |aggregate, hash| aggregate.merge hash }
merged # => {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}
Joshua Cheek

2023 update:

The merge method supports multiple arguments (since Ruby 2.6). Meaning, we can provide multiple hashes and they will all be merged together.

{a: 1}.merge({b: 2}, {c: 3})
# => {:a=>1, :b=>2, :c=>3}

If you have an array of hashes you can use the splat operator to spread the arguments:

hashes_to_merge = [{b: 2}, {c: 3}]
{a: 1}.merge(*hashes_to_merge)
# => {:a=>1, :b=>2, :c=>3}
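Applied to the question's original array, this allows a one-shot merge with no reduce at all (a minimal sketch; requires the multi-argument merge described above):

input = [{"testPARAM1"=>"testVAL1"}, {"testPARAM2"=>"testVAL2"}]
# Splat the array so each hash becomes a separate argument to merge.
{}.merge(*input)
# => {"testPARAM1"=>"testVAL1", "testPARAM2"=>"testVAL2"}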
dombesz