I have a nested dictionary Map<'a,Map<'b,'T>>, so that each combination of keys (a, b) identifies a unique entry.

In order to precompute efficiently, I need to invert the nesting of the keys, producing a Map<'b,Map<'a,'T>>.
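
To make the goal concrete, here is a minimal illustration using made-up sample data. The names `nested`, `inverted`, and `sameEntries` are hypothetical and only serve to show the intended shape of the result:

```fsharp
// Hypothetical sample data: the outer key is a string, the inner key is an int.
let nested : Map<string, Map<int, string>> =
    Map.ofList [ ("x", Map.ofList [ (1, "x1"); (2, "x2") ])
                 ("y", Map.ofList [ (1, "y1") ]) ]

// The desired result: the same entries, with inner and outer keys swapped.
let inverted : Map<int, Map<string, string>> =
    Map.ofList [ (1, Map.ofList [ ("x", "x1"); ("y", "y1") ])
                 (2, Map.ofList [ ("x", "x2") ]) ]

// Every entry stored under (a, b) in the input is stored under (b, a) in the output.
let sameEntries =
    nested
    |> Map.forall (fun a inner ->
        inner |> Map.forall (fun b v -> inverted.[b].[a] = v))
```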

I have some higher-order helpers that do the job: |/> applies an operation inside a nested sequence, |//> does the same two levels deep, and |*> enumerates the Cartesian product of a nested sequence. Still, I am wondering whether there is a better way to do this, in case someone has more elegant code to share.

let reversenmap (x:Map<'a,Map<'b,'T>>) :Map<'b,Map<'a,'T>> = 
      let ret  = x |> Map.toSeq |/> Map.toSeq |*> squash12
      let ret2 = ret |> Seq.groupByn2 (fun (a,b,t) -> b) 
                                      (fun (a,b,t) -> a) |//> Seq.head 
                                                         |//> (fun (a,b,c) -> c)
      ret2 |> Seq.toMapn2
pad
nicolas

3 Answers


I think the solution from @pad is definitely more idiomatic F# than using non-standard operators like |/> and |*>. I would probably prefer a version that uses sequence expressions instead of Seq.collect, which looks like this (the second part is the same as in the version from @pad):

let reverse (map: Map<'a,Map<'b,'T>>) = 
  [ for (KeyValue(a, m)) in map do
      for (KeyValue(b, v)) in m do yield b, (a, v) ]
  |> Seq.groupBy fst 
  |> Seq.map (fun (b, ats) -> b, ats |> Seq.map snd |> Map.ofSeq) 
  |> Map.ofSeq 
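
A quick usage sketch for the function above, assuming made-up sample data (`sample`, `flipped`, and `roundTrip` are hypothetical names). The function is repeated so the snippet stands alone. Applying it twice should give back the original map, provided no inner map is empty, since an outer key with an empty inner map produces no entries and therefore disappears:

```fsharp
let reverse (map: Map<'a, Map<'b, 'T>>) =
    [ for (KeyValue(a, m)) in map do
        for (KeyValue(b, v)) in m do
            yield b, (a, v) ]
    |> Seq.groupBy fst
    |> Seq.map (fun (b, ats) -> b, ats |> Seq.map snd |> Map.ofSeq)
    |> Map.ofSeq

// Hypothetical sample data with no empty inner maps.
let sample =
    Map.ofList [ ("x", Map.ofList [ (1, "x1"); (2, "x2") ])
                 ("y", Map.ofList [ (1, "y1") ]) ]

// Inner keys (ints) become outer keys, and vice versa.
let flipped = reverse sample

// Inverting twice restores the original nesting for this sample.
let roundTrip = reverse flipped
```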
Tomas Petricek
  • Sequence expressions make enumeration quite natural. Having nonstandard operators is indeed an inferior solution here. – nicolas Mar 22 '12 at 13:35
  • Shouldn't a sequence expression use `seq { ... }` rather than `[ ... ]`? That reminds me, is there any practical difference between `seq { 1..5 }` and `{ 1..5 }`? – Joel Mueller Mar 22 '12 at 16:41
  • @JoelMueller Did you mean `[ 1..5 ]` rather than `{ 1..5 }`? If so, the difference is in both the resulting type and evaluation method. A `seq { ... }` returns a `seq<'a>` and is evaluated lazily, just like `IEnumerable` (of which it is an alias). `[ ... ]` returns a `'a list` and is immediately evaluated. You can return an infinitely long `seq<'a>`, but you cannot do so with a `'a list`. –  Mar 23 '12 at 02:57
  • @RyanRiley - No, I meant that `seq { 1..5 }` and `{ 1..5 }` both produce a `seq` containing the same values. I was just wondering if there's a performance advantage to leaving out the `seq` keyword, or a correctness advantage to including it, or if it doesn't really matter. – Joel Mueller Mar 23 '12 at 16:34
  • @JoelMueller: You're able to write `{1..5}` since `..` is a range expression. In the general case, only `seq { ... }` is valid. – pad Mar 24 '12 at 13:15
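
The lazy versus eager distinction discussed in the comments above can be sketched as follows. This is only an illustration: the counters and value names are hypothetical, and the side effects exist purely to make the evaluation order observable.

```fsharp
// Counters that record how many elements each expression has computed.
let lazyCount = ref 0
let eagerCount = ref 0

// A seq expression is lazy: defining it computes nothing.
let lazySquares =
    seq {
        for i in 1 .. 5 do
            lazyCount.Value <- lazyCount.Value + 1
            yield i * i
    }

// A list expression is eager: all five elements are computed right here.
let eagerSquares =
    [ for i in 1 .. 5 do
        eagerCount.Value <- eagerCount.Value + 1
        yield i * i ]

let lazyCountBeforeUse = lazyCount.Value    // still 0, nothing has been consumed
let eagerCountAfterDef = eagerCount.Value   // 5, the list already exists

// Consuming part of the seq only runs the generator as far as needed.
let firstTwo = lazySquares |> Seq.take 2 |> Seq.toList

// An infinite seq is fine as long as only a finite prefix is consumed.
let firstThreeNaturals = Seq.initInfinite id |> Seq.take 3 |> Seq.toList
```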

I'm not sure I understand your intention. But from the signature of your function, we could do something like this:

let reverse (map: Map<'a,Map<'b,'T>>) =
    map |> Seq.collect (fun (KeyValue(a, m)) -> 
                                m |> Seq.map (fun (KeyValue(b, t)) -> b, (a, t)))
        |> Seq.groupBy fst
        |> Seq.map (fun (b, ats) -> b, ats |> Seq.map snd |> Map.ofSeq)
        |> Map.ofSeq
pad

@pad's solution is remarkably similar to what I came up with. I guess it just goes to show that with these sorts of problems, you follow your nose doing the only things that could work until you get there.

Alternatively, if you wanted to stick to folds, you could do:

let invertNesting ( map : Map<'a, Map<'b, 'c>> ) =
    let foldHelper ( oldState : Map<'b, Map<'a, 'c>> ) ( k1 : 'a ) ( innerMap : Map<'b, 'c> ) =
        innerMap |> Map.fold (fun tempState k2 v ->
                                  let innerMap' = match ( tempState |> Map.tryFind k2 ) with
                                                  | Some(m) -> m
                                                  | None -> Map.empty
                                  let innerMap'' = innerMap' |> Map.add k1 v
                                  tempState |> Map.add k2 innerMap'' ) oldState
    map |> Map.fold foldHelper Map.empty

While @Tomas Petricek's solution is more readable to me, this appears to be about 25% faster.

Mike Lynch
  • Very interesting. I remember having to write something similar, but I wrote it naively in two passes. – nicolas Mar 23 '12 at 08:56
  • How do you time your code, by the way? Just a regular timer from the Diagnostics namespace? – nicolas Mar 23 '12 at 08:57
  • In hindsight, that last comment really didn't give enough detail to be useful. Yes @nicolas, I used the Stopwatch class from the System.Diagnostics namespace for timing, running each test multiple times and averaging the runtimes. The inner and outer dictionaries were the same size in these tests. The speedup also seems to shrink as the dictionaries grow: it starts off over twice as fast, but drops to about 25% faster for the largest sizes I tested. – Mike Lynch Mar 26 '12 at 12:31
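
The timing approach described in the last comment (Stopwatch, several runs, averaged) could look roughly like the sketch below. This is not the original benchmark code: `timeAverage`, `makeSquare`, the map size of 50, and the run count of 20 are all assumptions chosen for illustration, and the two inversion functions are restated (the fold version in a slightly condensed form) so the snippet stands alone.

```fsharp
open System.Diagnostics

// Sequence-expression version, as in the answers above.
let reverse (map: Map<'a, Map<'b, 'T>>) =
    [ for (KeyValue(a, m)) in map do
        for (KeyValue(b, v)) in m do
            yield b, (a, v) ]
    |> Seq.groupBy fst
    |> Seq.map (fun (b, ats) -> b, ats |> Seq.map snd |> Map.ofSeq)
    |> Map.ofSeq

// Fold-based version, condensed from the answer above.
let invertNesting (map: Map<'a, Map<'b, 'c>>) =
    let foldHelper (state: Map<'b, Map<'a, 'c>>) (k1: 'a) (innerMap: Map<'b, 'c>) =
        innerMap
        |> Map.fold (fun acc k2 v ->
            let existing = defaultArg (Map.tryFind k2 acc) Map.empty
            Map.add k2 (Map.add k1 v existing) acc) state
    Map.fold foldHelper Map.empty map

// Hypothetical helper: average elapsed milliseconds of `f` over `runs` runs.
let timeAverage runs (f: unit -> 'r) =
    f () |> ignore                       // warm-up run, so JIT compilation is not measured
    let sw = Stopwatch.StartNew()
    for _i in 1 .. runs do
        f () |> ignore
    sw.Stop()
    sw.Elapsed.TotalMilliseconds / float runs

// Hypothetical test data: n outer keys, each with n inner keys.
let makeSquare n =
    Map.ofList [ for a in 1 .. n -> (a, Map.ofList [ for b in 1 .. n -> (b, a * b) ]) ]

let sample = makeSquare 50
let msGroupBy = timeAverage 20 (fun () -> reverse sample)
let msFold = timeAverage 20 (fun () -> invertNesting sample)
printfn "groupBy: %.3f ms, fold: %.3f ms" msGroupBy msFold
```

The absolute numbers will vary with the machine, the runtime, and the map sizes, so a sketch like this is only good for a rough comparison between the two versions.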