
Let's assume I need to do a trivial task on every element of a Hash, e.g. increment each value by 1, or wrap each value in an array. I've been doing it like this:

hash.map{ |k, v| [k, v+1] }.to_h

v+1 is just an example, it can be anything.

Is there any cleaner way to do this? I don't really like mapping a hash to an array of 2-sized arrays, then remembering to convert it to hash again.

Example of what might be nicer:

hash.hash_map{ |v| v+1 }

This way, something like string conversion (to_s) could be simplified to

hash.hash_map(&:to_s)

Duplication clarification: I'm not looking for Hash[...] or .to_h; I'm asking if anyone knows a more compact and cleaner solution.

JackWhiteIII
Piotr Kruczek

2 Answers


That's just the way Ruby's collection framework works. There is one map method in Enumerable which doesn't know anything about hashes or arrays or lists or sets or trees or streams or whatever else you may come up with. All it knows is that there is a method named each which will yield one single element per iteration. That's it.
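A minimal sketch of that contract, assuming a made-up `Playlist` class (the class and its contents are purely illustrative): it defines nothing but `each`, mixes in `Enumerable`, and `map` hands back a plain Array, just as it does for a Hash.

```ruby
# Hypothetical collection: the only iteration method it defines itself is `each`.
class Playlist
  include Enumerable

  def initialize(*songs)
    @songs = songs
  end

  # Enumerable builds map, select, reduce, etc. on top of this one method.
  def each(&block)
    return enum_for(:each) unless block
    @songs.each(&block)
    self
  end
end

playlist = Playlist.new("intro", "verse")

upcased = playlist.map(&:upcase)                  # an Array, not a Playlist
pairs   = { a: 1 }.map { |k, v| [k, v + 1] }      # an Array of [key, value] pairs
```

Because `map` only ever sees the elements yielded by `each`, it has no way to know what kind of collection to rebuild, so it always collects into an Array.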

Note that this is the same way the collections frameworks of Java and .NET work, too. All collections operations always return the same type: in .NET, that's IEnumerable, in Ruby, that's Array.

Another design approach is that collections operations are type-preserving, i.e. mapping a set will produce a set, etc. That's the way it is done in Smalltalk, for example. However, in Smalltalk it is achieved by copy&pasting almost identical methods into each and every different collection. I.e. if you want to implement your own collection in Ruby, you only have to implement each, and you get everything else for free, whereas in Smalltalk, you have to implement every single collection method separately. (In Ruby, that would be over 40 methods.)

Scala is the first language that managed to provide a collections framework with type-preserving operations without code duplication, but it took until Scala 2.8 (released in 2010) to figure that out. (The key is the idea of collection builders.) Ruby's collections library was designed in 1993, 17 years before we had figured out how to do type-preserving collections operations without code duplication. Plus, Scala depends heavily on its sophisticated static type system and type-level metaprogramming to find the correct collection builder at compile time. This is not necessary for the scheme to work, but having to look up the builder for every operation at runtime may incur a hefty runtime cost.
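To make the builder idea a little more concrete, here is a rough Ruby sketch with the lookup done at runtime. It is only loosely inspired by the idea and is not how Scala implements it; `BUILDERS` and `preserving_map` are hypothetical names invented for this illustration.

```ruby
require "set"

# Hypothetical builder registry: for each collection class, a factory for an
# empty result and a step that adds one mapped element to it.
BUILDERS = {
  Array => [-> { [] },      ->(acc, element) { acc << element }],
  Set   => [-> { Set.new }, ->(acc, element) { acc << element }],
  Hash  => [-> { {} },      ->(acc, pair) { key, value = pair; acc[key] = value }]
}.freeze

# One generic map, written once, that returns the same kind of collection it
# was given. The builder is looked up at runtime for every call.
def preserving_map(collection)
  make_empty, add = BUILDERS.fetch(collection.class)
  collection.each_with_object(make_empty.call) do |element, acc|
    add.call(acc, yield(element))
  end
end

doubled_set  = preserving_map(Set[1, 2]) { |x| x * 2 }           # a Set
bumped_hash  = preserving_map({ a: 1 }) { |k, v| [k, v + 1] }    # a Hash
```

The `BUILDERS.fetch` call on every invocation is the per-operation runtime lookup mentioned above.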

What you could do is add new methods that are not part of the standard Enumerable protocol, for example similar to Scala's mapValues and mapKeys.
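As a sketch of what such methods might look like, assuming the names `map_values` and `map_keys` (hypothetical, chosen to mirror the Scala names; they are not part of core Ruby):

```ruby
class Hash
  # Hypothetical helper: returns a new Hash with the same keys, where each
  # value is replaced by the block's result. The receiver is left untouched.
  def map_values
    return enum_for(:map_values) unless block_given?
    each_with_object({}) { |(key, value), result| result[key] = yield(value) }
  end

  # Hypothetical helper: returns a new Hash with the same values, where each
  # key is replaced by the block's result.
  def map_keys
    return enum_for(:map_keys) unless block_given?
    each_with_object({}) { |(key, value), result| result[yield(key)] = value }
  end
end

counts = { a: 1, b: 2 }
counts.map_values { |v| v + 1 }   # values incremented, keys unchanged
counts.map_values(&:to_s)         # symbol-to-proc works too
counts.map_keys(&:to_s)           # keys converted, values unchanged
```

Ruby 2.4 and later ship a built-in `Hash#transform_values` (and 2.5 and later `Hash#transform_keys`) covering the same ground, so on those versions no patch is needed.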

Jörg W Mittag
  • `Array` is [reimplementing](http://ruby-doc.org/core-2.2.2/Array.html#method-i-map) `#map`, I don’t see any problem in reimplementing `map` in `Hash`, besides that OP wants to add new behavioral feature, rather than to overload existing one. So, this is definitely about extending `Hash` class, not about overloading a default collection behavior. – Aleksei Matiushkin Jun 17 '15 at 11:42
  • Yes, there are a few overrides of the `Enumerable` methods, for performance reasons. However, they are honoring the contract of `Enumerable`. Adding something like `map_values` and `map_keys` is perfectly possible, all it requires is a feature request, and enough support by the community and the designers. – Jörg W Mittag Jun 17 '15 at 11:49
  • There is no way to add `map_{values,keys}` to `Enumerable`, since there is no notion of `keys` and `values` in `Enumerable`. That said, I would monkeypatch the `Hash` class myself and use it at my own pleasure. Anyway, thanks for a great tour of collection contracts in different languages. – Aleksei Matiushkin Jun 17 '15 at 11:55
  • Thank you for showing how this works in detail, that's the explanation I've been looking for. – Piotr Kruczek Jun 17 '15 at 12:14

AFAIK, this does not exist in Hash out of the box, but here is a simple monkeypatch to achieve what you want:

▶ class Hash
▷   def hash_map &cb
▷     keys.zip(values.map(&cb)).to_h
▷   end  
▷ end

There are more readable ways to achieve the requested functionality, but this one calls the built-in map on the values only once, which should make it the fastest implementation that comes to my mind.

▶ h = {a: 1, b: 2}
#⇒ { :a => 1, :b => 2 }
▶ h.hash_map do |v| v + 5 end
#⇒ { :a => 6, :b => 7 }
Aleksei Matiushkin
  • Why the explicit `self`? Btw. here is another slightly faster solution: `def values_update; inject([]) {|ac,(k,v)| ac << [k, yield(v)] }.to_h; end`. Also processed only once, faster on my config CRuby 2.2.2 and JRuby 9.0.0.0pre2 on Linux x86_64. – David Unric Jun 17 '15 at 14:05
  • @DavidUnric It’s a matter of habit. I am addicted to specifying `self` with methods whose names are as generic as `keys` and `values`. Silly addiction, though. Will remove them from the answer. I am surprised that creating a new `Array` instance is faster than `map`: it sounds like an implementation bug. BTW, your implementation should be even faster if it created a hash in `inject`. – Aleksei Matiushkin Jun 17 '15 at 14:43
  • Good point, feeling silly now: `inject({}) {|ac,(k,v)| ac[k]=yield(v);ac }` – David Unric Jun 17 '15 at 15:20
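
Written out as a complete method, the last suggestion from the comments might look like the sketch below. The method name `values_update` is taken from the earlier comment; no timing claims are made here, since relative speed depends on the Ruby implementation and version.

```ruby
class Hash
  # Builds the result Hash directly inside inject, so no intermediate
  # Array of pairs and no final to_h conversion are needed.
  def values_update
    inject({}) do |acc, (key, value)|
      acc[key] = yield(value)
      acc # inject uses the block's return value as the next accumulator
    end
  end
end

h = { a: 1, b: 2 }
updated = h.values_update { |v| v + 5 }
```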