63

I was just surprised to learn that mapValues produces a view. The following example shows the consequence:

case class thing(id: Int)
val rand = new java.util.Random
val distribution = Map(thing(0) -> 0.5, thing(1) -> 0.5)
val perturbed = distribution mapValues { _ + 0.1 * rand.nextGaussian }
val sumProbs = perturbed.map{_._2}.sum
val newDistribution = perturbed mapValues { _ / sumProbs }

The idea is that I have a distribution, perturb it with some randomness, and then renormalize it. The code fails at its original intention: since mapValues produces a view, _ + 0.1 * rand.nextGaussian is re-evaluated every time perturbed is used, so sumProbs and newDistribution are computed from different random draws.
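The re-evaluation can be made visible with a counter instead of random numbers. This is only a minimal sketch: the object name MapValuesLazinessDemo and the counter calls are made up for illustration, and it assumes Scala 2.12 or earlier (on 2.13 the same call compiles with a deprecation warning and returns a MapView, which is lazy as well).

```scala
object MapValuesLazinessDemo {
  // Counts how many times the function passed to mapValues actually runs.
  var calls = 0

  val source: Map[String, Int] = Map("a" -> 1, "b" -> 2)

  // Nothing is computed here: the result only wraps `source` and the function.
  val wrapped = source.mapValues { v => calls += 1; v * 10 }
  val callsAfterCreation = calls

  // Every lookup applies the function again instead of reusing a stored value.
  val first = wrapped("a")
  val callsAfterFirstLookup = calls
  val second = wrapped("a")
  val callsAfterSecondLookup = calls

  def main(args: Array[String]): Unit =
    println(s"creation=$callsAfterCreation, first=$callsAfterFirstLookup, second=$callsAfterSecondLookup")
}
```

With a pure function the repeated calls only cost time; with rand.nextGaussian each call yields a different value, which is what breaks the renormalization.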

I am now doing something like distribution map { case (s, p) => (s, p + 0.1 * rand.nextGaussian) }, but that is a little verbose. So the purpose of this question is:

  1. Remind people who are unaware of this fact.
  2. Learn why mapValues was designed to return a view.
  3. Find out whether there is an alternative method that produces a concrete Map.
  4. Find out whether any other commonly used collection methods have this trap.

Thanks.

Kane

3 Answers

41

There's a ticket about this, SI-4776 (filed by yours truly).

The commit that introduces it has this to say:

Following a suggestion of jrudolph, made filterKeys and mapValues transform abstract maps, and duplicated functionality for immutable maps. Moved transform and filterNot from immutable to general maps. Review by phaller.

I have not been able to find the original suggestion by jrudolph, but I assume it was done to make mapValues more efficient. Given the question, that may come as a surprise, but mapValues is more efficient if you are not likely to iterate over the values more than once.

As a workaround, one can call mapValues(...).view.force to produce a new Map.

Ben Reich
Daniel C. Sobral
  • 5
    nice, but still I wonder why mapValues doesn't return the view directly to make that more explicit? – Alois Cochard Feb 14 '13 at 20:14
  • 1
    @AloisCochard Yeah that's a good point. It's much better if return type is `view` so we are alerted what is going on there... – Kane Feb 14 '13 at 21:42
  • 3
    @AloisCochard As you can see in the ticket, that's precisely what I was asking for. It would have the further benefit of making a `force` method directly available. – Daniel C. Sobral Feb 14 '13 at 22:30
  • I don't see the reasoning behind this decision. Using a view would improve performance only if the number of invocations (of existing keys) is less than the number of entries; from my experience, this is hardly the case in most scenarios. Views have other advantages if the underlying data is mutable (see SQL), which is less preferable in Scala as it goes against the functional approach. – Eyal Roth Dec 05 '16 at 16:49
11

The Scaladoc says:

a map view which maps every key of this map to f(this(key)). The resulting map wraps the original map without copying any elements.

So this should be expected, but it scares me a lot; I'll have to review a bunch of code tomorrow. I wasn't expecting behavior like that :-(

Another workaround:

You can call toSeq to get a copy, and then toMap if you need it back as a map, but this creates objects unnecessarily and has a performance cost compared with using map.

One can fairly easily write a mapValues that doesn't create a view. I'll do it tomorrow and post the code here if no one does it before me ;)
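Such a strict variant might look like the sketch below. It is an illustration, not code posted in this thread: the names StrictMapValues, StrictMapOps and mapValuesStrict are hypothetical, and it simply delegates to map, which builds a new concrete Map.

```scala
object StrictMapValues {
  // Hypothetical extension method: applies f once per entry, up front.
  implicit class StrictMapOps[K, V](private val underlying: Map[K, V]) extends AnyVal {
    def mapValuesStrict[W](f: V => W): Map[K, W] =
      underlying.map { case (k, v) => k -> f(v) }
  }
}

object StrictMapValuesDemo {
  import StrictMapValues._

  // Counts how many times the mapping function runs.
  var calls = 0

  val strict: Map[String, Int] =
    Map("a" -> 1, "b" -> 2).mapValuesStrict { v => calls += 1; v * 10 }
  val callsAfterCreation = calls

  // Lookups read the stored results; the function is not applied again.
  val total = strict("a") + strict("b")
  val callsAfterLookups = calls
}
```

The function runs once for each entry when the map is built, so later lookups see stable values even if the function is random or has side effects.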

EDIT:

I found an easy way to 'force' the view: use '.map(identity)' after mapValues (so there is no need to implement a specific function):

scala> import scala.util.Random
import scala.util.Random

scala> val xs = Map("a" -> 1, "b" -> 2)
xs: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1, b -> 2)

scala> val ys = xs.mapValues(_ + Random.nextInt).map(identity)
ys: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1315230132, b -> 1614948101)

scala> ys
res7: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1315230132, b -> 1614948101)

It's a shame the type returned isn't actually a view! Otherwise one would have been able to call 'force'...

Ben Reich
Alois Cochard
  • 1
    running with Scala `2.12.0-M3`, it's not clear to me what `map(identity)` buys you: `Map("a" -> 1, "b" -> 2).mapValues(_ + Random.nextInt)` returns `scala.collection.immutable.Map[String,Int] = Map(a -> 1496073565, b -> -1842623900)`. Could you please elaborate? I *think* that the potential problem of `mapValues` is that it lazily evaluates values, but I'm not sure. Thanks – Kevin Meredith Dec 03 '15 at 16:26
  • 2
    this is crazily surprising. how can view's map returns a concrete map instead of a view? this is beyond inconsistency. this is an entire lack of consideration. confirm from the source code that this thing still exists in 2.12. – Jason Hu Jan 28 '18 at 00:10
1

This is better (and deprecated) in Scala 2.13: mapValues now returns a MapView. See the API Doc.
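On 2.13 the laziness can be spelled out by going through .view, and toMap then produces a concrete Map. A minimal sketch, assuming Scala 2.13 or later (it does not compile on 2.12); the object name MapViewDemo and the counter calls are made up for illustration.

```scala
object MapViewDemo {
  // Counts how many times the mapping function runs.
  var calls = 0

  val source: Map[String, Int] = Map("a" -> 1, "b" -> 2)

  // Explicitly lazy: the static type is MapView, so the deferral is visible.
  val lazyValues: scala.collection.MapView[String, Int] =
    source.view.mapValues { v => calls += 1; v * 10 }
  val callsBeforeToMap = calls

  // toMap evaluates the function once per entry and stores the results.
  val strict: Map[String, Int] = lazyValues.toMap
  val callsAfterToMap = calls

  // Lookups on the strict map do not run the function again.
  val total = strict("a") + strict("a")
  val callsAfterLookups = calls
}
```

Because the lazy step has its own type, accidentally keeping the view around instead of a Map shows up in type signatures rather than only at runtime.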

Frederick Roth