
I need to group a list of tuples in a particular way.

For example, if I have

val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))

I want to group the list by the second value and map each key to the set of first values.

So the result should be

Map(2 -> Set(1,4), 3 -> Set(2,10))

So far, I have come up with this:

l groupBy { p => p._2 } mapValues { v => (v map { vv => vv._1 }).toSet } 

This works, but I believe there should be a much more efficient way...

youngwoong

1 Answer


This is similar to this question. Basically, as @serejja said, your approach is correct and also the most concise one. You could pass collection.breakOut as the builder factory argument to the last map, so that the Set is built directly and the additional iteration of toSet is saved:

l.groupBy(_._2).mapValues(_.map(_._1)(collection.breakOut): Set[Int])

You probably shouldn't go beyond this unless you really need to squeeze out more performance.
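If you do want to avoid the intermediate per-key lists, the following is a minimal sketch of a single-pass alternative that folds straight into an immutable Map of Sets. It does not rely on collection.breakOut, which is no longer available from Scala 2.13 on. The name grouped is only illustrative, and this is one possible approach rather than a measured optimization.

```scala
val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))

// Fold over the list once: for each tuple, add its first element
// to the Set stored under its second element.
val grouped: Map[Int, Set[Int]] =
  l.foldLeft(Map.empty[Int, Set[Int]]) { case (acc, (first, second, _)) =>
    acc.updated(second, acc.getOrElse(second, Set.empty[Int]) + first)
  }
```

For the sample list this yields the same grouping as the groupBy version: key 2 maps to the set of 1 and 4, key 3 to the set of 2 and 10.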


Otherwise, this is what a general toMultiMap function could look like, one that lets you control the collection type of the values:

import collection.generic.CanBuildFrom
import collection.mutable

def toMultiMap[A, K, V, Values](xs: TraversableOnce[A])
    (key: A => K)(value: A => V)
    (implicit cbfv: CanBuildFrom[Nothing, V, Values]): Map[K, Values] = {
  val b = mutable.Map.empty[K, mutable.Builder[V, Values]]
  xs.foreach { elem =>
    b.getOrElseUpdate(key(elem), cbfv()) += value(elem)
  }
  b.map { case (k, vb) => (k, vb.result()) } (collection.breakOut)
}

It uses a mutable Map during the building stage, with the values for each key gathered in a mutable Builder (provided by the CanBuildFrom instance). Once all input elements have been processed, that mutable map of builders is converted into an immutable map of the values collection type, again using the collection.breakOut trick to produce the desired output collection directly.
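To make the mechanism concrete, here is a minimal sketch of the same map-of-builders idea specialised to Set values, so that it needs neither CanBuildFrom nor breakOut. The name groupToSets is hypothetical and not part of any library; the generic version above remains the one that lets the caller choose the values type.

```scala
import scala.collection.mutable

// Hypothetical helper: same two-phase approach as toMultiMap, fixed to Set values.
def groupToSets[A, K, V](xs: Iterable[A])(key: A => K)(value: A => V): Map[K, Set[V]] = {
  // Phase 1: one mutable Builder per key, created lazily on first use.
  val builders = mutable.Map.empty[K, mutable.Builder[V, Set[V]]]
  xs.foreach { elem =>
    builders.getOrElseUpdate(key(elem), Set.newBuilder[V]) += value(elem)
  }
  // Phase 2: freeze every builder and collect into an immutable Map.
  builders.iterator.map { case (k, b) => k -> b.result() }.toMap
}

val sets = groupToSets(List((1,2,3),(4,2,5),(2,3,3),(10,3,2)))(_._2)(_._1)
```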

Ex:

val l                     = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))
val v                     = toMultiMap(l)(_._2)(_._1)  // uses Vector for values
val s: Map[Int, Set[Int]] = toMultiMap(l)(_._2)(_._1)  // uses Set for values

So the result type you annotate directs the type inference of the values type. If you do not annotate the result, Scala will pick Vector as the default collection type.

0__