2

Given a Seq of tuples like:

Seq(
  ("a",Set(1,2)),
  ("a",Set(2,3)),
  ("b",Set(4,6)),
  ("b",Set(5,6))
)

I would like to groupBy and then flatMap the values to obtain something like:

Map(
  b -> Set(4, 6, 5), 
  a -> Set(1, 2, 3)
)

My first implementation is:

Seq(
  ("a" -> Set(1,2)),
  ("a" -> Set(2,3)),
  ("b" -> Set(4,6)),
  ("b" -> Set(5,6))
) groupBy (_._1) mapValues (_ map (_._2)) mapValues (_.flatten.toSet)

I was wondering if there was a more efficient and maybe simpler way to achieve that result.

Filippo Vitale
3 Answers

4

I would use foldLeft: I think it's more readable, and it lets you avoid groupBy altogether.

val r = Seq(
    ("a",Set(1,2)),
    ("a",Set(2,3)),
    ("b",Set(4,6)),
    ("b",Set(5,6))
  ).foldLeft(Map[String, Set[Int]]()){
    case (seed,(k,v)) => {
      seed.updated(k,v ++ seed.getOrElse(k,Set[Int]()))
    }
  }
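
The fold above can also be wrapped in a small generic helper. This is only a minimal sketch: the names `MergeByKey` and `mergeSets` are hypothetical, chosen for illustration, and are not part of any library.

```scala
object MergeByKey {
  // Hypothetical helper: unions all sets that share a key in a single pass.
  def mergeSets[K, V](pairs: Seq[(K, Set[V])]): Map[K, Set[V]] =
    pairs.foldLeft(Map.empty[K, Set[V]]) { case (acc, (k, v)) =>
      // Take the set accumulated so far for k (or an empty set) and add v to it.
      acc.updated(k, acc.getOrElse(k, Set.empty[V]) ++ v)
    }
}
```

Calling `MergeByKey.mergeSets` on the Seq from the question should give one entry per key, each holding the union of that key's sets.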
grotrianster
4

You were on the right track, but you can simplify a bit by using a single mapValues and combining the map and flatten into one flatMap:

val r = Seq(
  ("a" -> Set(1,2)),
  ("a" -> Set(2,3)),
  ("b" -> Set(4,6)),
  ("b" -> Set(5,6))
).groupBy(_._1).mapValues(_.flatMap(_._2).toSet)

I actually find this a lot more readable than the foldLeft version (but note that mapValues returns a non-strict collection, which may or may not be what you want).
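
If a strict Map is what you want, the following sketch shows two options, assuming Scala 2.13 or later (where `mapValues` on Map is deprecated in favour of `.view.mapValues`, and `groupMapReduce` is available). The object name `GroupStrict` and the val names are hypothetical.

```scala
object GroupStrict {
  val pairs: Seq[(String, Set[Int])] = Seq(
    ("a", Set(1, 2)),
    ("a", Set(2, 3)),
    ("b", Set(4, 6)),
    ("b", Set(5, 6))
  )

  // Group by key, keep only the set, and union the sets per key.
  val viaGroupMapReduce: Map[String, Set[Int]] =
    pairs.groupMapReduce(_._1)(_._2)(_ ++ _)

  // Same shape as groupBy + mapValues; .view.mapValues is lazy and .toMap forces it.
  val viaView: Map[String, Set[Int]] =
    pairs.groupBy(_._1).view.mapValues(_.flatMap(_._2).toSet).toMap
}
```

Both vals should hold the same strict Map; `groupMapReduce` avoids building the intermediate per-key sequences that `groupBy` creates.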

Travis Brown
0

@grotrianster's answer could be refined using scalaz's Semigroup binary operation |+|, which is defined for both Set and Map:

import scalaz.syntax.semigroup._
import scalaz.std.map._
import scalaz.std.set._

Seq(
  ("a",Set(1,2)),
  ("a",Set(2,3)),
  ("b",Set(4,6)),
  ("b",Set(5,6))
).foldLeft(Map[String, Set[Int]]()){case (seed, (k, v)) => seed |+| Map(k -> v)}

Using reduce instead of fold:

Seq(
  ("a", Set(1, 2)),
  ("a", Set(2, 3)),
  ("b", Set(4, 6)),
  ("b", Set(5, 6))
).map(Map(_)).reduce({_ |+| _})

Treating Set and Map as Monoids (suml additionally needs scalaz.syntax.foldable._ and scalaz.std.list._ in scope):

Seq(
  ("a", Set(1, 2)),
  ("a", Set(2, 3)),
  ("b", Set(4, 6)),
  ("b", Set(5, 6))
).map(Map(_)).toList.suml
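
To make it clearer what |+| does for a Map whose values are Sets, here is a plain-Scala sketch without scalaz. It is a hand-rolled stand-in, not scalaz's actual implementation; `MapMerge`, `combine` and `sumAll` are hypothetical names.

```scala
object MapMerge {
  // Hypothetical stand-in for |+| on Map[K, Set[V]]:
  // a key present in both maps ends up with the union of its two sets.
  def combine[K, V](x: Map[K, Set[V]], y: Map[K, Set[V]]): Map[K, Set[V]] =
    y.foldLeft(x) { case (acc, (k, v)) =>
      acc.updated(k, acc.getOrElse(k, Set.empty[V]) ++ v)
    }

  // Folding from the empty map plays the role of the Monoid's empty value,
  // so an empty input is handled too (unlike reduce, which would throw).
  def sumAll[K, V](maps: Seq[Map[K, Set[V]]]): Map[K, Set[V]] =
    maps.foldLeft(Map.empty[K, Set[V]])(combine(_, _))
}
```

Applied to the single-entry maps produced by `.map(Map(_))`, `MapMerge.sumAll` should give the same result as the scalaz versions above.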
Filippo Vitale