14

What is the best way to turn a Map[A, Set[B]] into a Map[B, Set[A]]?

For example, how do I turn a

Map(1 -> Set("a", "b"),
    2 -> Set("b", "c"),
    3 -> Set("c", "d"))

into a

Map("a" -> Set(1),
    "b" -> Set(1, 2),
    "c" -> Set(2, 3),
    "d" -> Set(3))

(I'm using immutable collections only here. And my real problem has nothing to do with strings or integers. :)

aioobe
  • Best in what way? :) I found the preferred solutions to be much slower than mine. – hbatista Mar 31 '11 at 12:40
  • 1
    Ah, right. Until I have a complete system, I always prefer conciseness and clarity over performance. (And I'm writing a full compiler, so I doubt this will be the bottleneck :) – aioobe Mar 31 '11 at 13:02
  • Perfectly sound approach, but I find that sometimes conciseness and clarity don't go hand in hand... :) – hbatista Mar 31 '11 at 17:26

6 Answers

10

with help from aioobe and Moritz:

def reverse[A, B](m: Map[A, Set[B]]): Map[B, Set[A]] =
  // keySet (rather than keys) keeps each value of the result a Set[A]
  m.values.toSet.flatten.map(v => (v, m.keySet.filter(m(_)(v)))).toMap

It's a bit more readable if you explicitly call contains:

def reverse[A, B](m: Map[A, Set[B]]): Map[B, Set[A]] =
  m.values.toSet.flatten.map(v => (v, m.keySet.filter(m(_).contains(v)))).toMap
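
The short form works because an immutable Set[A] is also a function A => Boolean, so applying a set to an element is a membership test. Here is a minimal sketch of that idea on the sample data from the question; the names sample, hasB, invert and inverted are made up for this illustration, and the helper is one possible spelling rather than the exact code above.

```scala
// Sample data from the question.
val sample: Map[Int, Set[String]] =
  Map(1 -> Set("a", "b"), 2 -> Set("b", "c"), 3 -> Set("c", "d"))

// A Set is a predicate: sample(1)("b") is the same test as sample(1).contains("b").
val hasB: Boolean = sample(1)("b")

// Same approach as the answer: collect the distinct values, then for each value
// keep the keys whose set contains it. keySet makes each result value a Set[A].
def invert[A, B](m: Map[A, Set[B]]): Map[B, Set[A]] = {
  val allValues: Set[B] = m.values.flatten.toSet
  allValues.map(v => v -> m.keySet.filter(k => m(k)(v))).toMap
}

val inverted: Map[String, Set[Int]] = invert(sample)
```

Note that this scans every key once per distinct value, so it trades speed for brevity.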
Seth Tisue
  • heh. aioobe's solution is nearly identical – Seth Tisue Mar 31 '11 at 12:13
  • Nice. I think we got a winner :D May I suggest changing `k => ...` into just `m(_).contains(v)` :-) – aioobe Mar 31 '11 at 12:15
  • Also, is `.distinct` really needed? Isn't that step taken care of in `.toMap`? After removing `.distinct` I think one could get rid of `.toSeq` as well, or am I missing something? – aioobe Mar 31 '11 at 12:17
  • @aioobe: I edited this to incorporate your m(_) change. As for the toSeq.distinct part, you're right, it isn't strictly necessary. But I figured it was better to discard the duplicates early in the computation. – Seth Tisue Mar 31 '11 at 12:28
  • 2
    @aioobe then you could go even further and write it as `m.values.toSet.flatten.map(v => (v, m.keys.filter(m(_)(v)))).toMap` - I think that is as short as it can get (as `Set[A] <: (A => Boolean)`). – Moritz Mar 31 '11 at 12:35
  • @Moritz, aah, hah. Cool :-) That's what I'm using now :-) Quite hard to read now though ;-)) – aioobe Mar 31 '11 at 12:55
3

Best I've come up with so far is

val intToStrs = Map(1 -> Set("a", "b"),
                    2 -> Set("b", "c"),
                    3 -> Set("c", "d"))

def mappingFor(key: String) =
    intToStrs.keys.filter(intToStrs(_) contains key).toSet

val newKeys = intToStrs.values.flatten
val inverseMap = newKeys.map(newKey => (newKey -> mappingFor(newKey))).toMap
aioobe
3

Or another one using folds:

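  def reverse2[A, B](m: Map[A, Set[B]]): Map[B, Set[A]] =
    // outer fold walks the (key, set) entries; inner fold files key k under each element e
    m.foldLeft(Map.empty[B, Set[A]]) { case (r, (k, s)) =>
      s.foldLeft(r) { case (acc, e) =>
        acc + (e -> (acc.getOrElse(e, Set.empty[A]) + k))
      }
    }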
hbatista
3

Here's a one statement solution

 originalMap
 .map{case (k, v)=>v.map{v2=>(v2,k)}}
 .flatten
 .groupBy{_._1}
 .transform {(k, v)=>v.unzip._2.toSet}

This bit rather neatly (*) produces the tuples needed to construct the reverse map

Map(1 -> Set("a", "b"),
    2 -> Set("b", "c"),
    3 -> Set("c", "d"))
.map{case (k, v)=>v.map{v2=>(v2,k)}}.flatten

produces

 List((a,1), (b,1), (b,2), (c,2), (c,3), (d,3))

Converting it directly to a map overwrites the values corresponding to duplicate keys, though.
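
A small sketch of that overwrite, assuming the pair list shown above (the names pairs and lossy are only for illustration): toMap keeps a single value per key, so the entries for b and c each lose one of their values.

```scala
// The (value, key) pairs produced by the map/flatten step.
val pairs: List[(String, Int)] =
  List(("a", 1), ("b", 1), ("b", 2), ("c", 2), ("c", 3), ("d", 3))

// toMap keeps one entry per key; a later pair replaces an earlier one,
// so the result holds a single Int per key instead of a Set[Int].
val lossy: Map[String, Int] = pairs.toMap
```

That is why a grouping step is needed before the pairs are collapsed into a map.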

Adding .groupBy{_._1} gets this

 Map(c -> List((c,2), (c,3)),
     a -> List((a,1)),
     d -> List((d,3)), 
     b -> List((b,1), (b,2)))

which is closer. To turn those lists into Sets of the second element of each pair, adding

  .transform {(k, v)=>v.unzip._2.toSet}

gives

  Map(c -> Set(2, 3), a -> Set(1), d -> Set(3), b -> Set(1, 2))

QED :)

(*) YMMV
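
On newer Scala versions the groupBy and transform steps can be merged. The following is only a sketch and assumes Scala 2.13 or later, where groupMap is available on the standard collections; the name invertGrouped is made up for this example.

```scala
// Assumes Scala 2.13+ (groupMap was added to the collections in 2.13).
def invertGrouped[A, B](m: Map[A, Set[B]]): Map[B, Set[A]] =
  m.toSeq
    .flatMap { case (k, vs) => vs.toSeq.map(v => (v, k)) } // (value, key) pairs
    .groupMap(_._1)(_._2)                                  // Map[B, Seq[A]]
    .map { case (v, ks) => (v, ks.toSet) }                 // Seq[A] -> Set[A]
```

groupMap groups by the first element of each pair and keeps only the second, which replaces the unzip step.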

The Archetypal Paul
1

The easiest way I can think of is:

// unfold values to tuples (v,k)
// for all values v in the Set referenced by key k
def vk = for {
  (k,vs) <- m.iterator
  v <- vs.iterator
} yield (v -> k)

// fold iterator back into a map
(Map[String,Set[Int]]() /: vk) {
// alternative syntax: vk.foldLeft(Map[String,Set[Int]]()) {
  case (m,(k,v)) if m contains k =>
    // Map already contains a Set, so just add the value
    m updated (k, m(k) + v)
  case (m,(k,v)) =>
    // key not in the map - wrap value in a Set and return updated map
    m updated (k, Set(v))
}
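
For readers who find /: hard to follow, the same two steps can be written as one self-contained function with foldLeft. This is a sketch under the assumption that generic types are wanted instead of String and Int; the names invertFold, pairs and acc are introduced here and are not part of the answer.

```scala
// Sketch: unfold into (value, key) pairs, then fold them back into a map.
def invertFold[A, B](m: Map[A, Set[B]]): Map[B, Set[A]] = {
  // one (v, k) pair for every value v in the Set stored under key k
  val pairs = for {
    (k, vs) <- m.iterator
    v <- vs.iterator
  } yield (v, k)

  // start from an empty map and grow the Set stored under each value
  pairs.foldLeft(Map.empty[B, Set[A]]) { case (acc, (v, k)) =>
    acc.updated(v, acc.getOrElse(v, Set.empty[A]) + k)
  }
}
```

Using getOrElse with an empty Set folds the two cases of the pattern match into one.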
Moritz
  • 1
    That looks correct, but I always find the /: syntax for folds really hard to read. Any chance you could annotate your answer with an explanation? – Marcus Downing Mar 31 '11 at 11:54
  • Added some comments. I personally prefer the `/:` to `foldLeft` and `:\ ` to `foldRight` as it implies the order of arguments in the function / pattern match. – Moritz Mar 31 '11 at 12:21
1

A simple, but maybe not super-elegant solution:

  def reverse[A,B](m:Map[A,Set[B]])={
      var r = Map[B,Set[A]]()
      m.keySet foreach { k=>
          m(k) foreach { e =>
            r = r + (e -> (r.getOrElse(e, Set()) + k))
          }
      }
      r
  }
hbatista