In Scala, I have a sequence `val test: Seq[String] = Seq("table", "bag", "chair", "chair")`.

I want to generate all two-element combinations of the above sequence, with order being significant, and also count how many times each combination occurs in the sequence, e.g. (table,bag,1), (table,chair,2), (bag,table,1), (bag,chair,2), (chair,table,1), (chair,bag,2).

Also, I don't need to consider the combination of an element with itself, i.e. (chair,chair) should be ignored. How can I do this?

Jeet Banerjee
  • You can use the `subsets` method for this and then keep only the subsets of length 2. – Raman Mishra Oct 22 '18 at 17:38
  • Why is `(bag,chair,2)` but `(chair,bag,1)`? If "chair" is in the sequence twice shouldn't every combination with "chair" come up with 2? – jwvh Oct 22 '18 at 17:41
  • Yeah, it is better to consider `(chair,bag,2)` – Jeet Banerjee Oct 22 '18 at 17:43
  • @RamanMishra if I use `subsets`, then `(table, bag)` and `(bag,table)` become the same, but I want to keep them distinct – Jeet Banerjee Oct 22 '18 at 17:47
  • @JeetBanerjee why `(table,chair,2)` but `(chair,table,1)`? – Raman Mishra Oct 22 '18 at 17:58
  • So question, what is the result you want for `val seq = Seq("One", "One", "Two", "Two")`? `List((One,Two,4), (Two,One,4))`? – Daniel Hinojosa Oct 22 '18 at 22:11
  • Also what is the result that you want for `Seq("One", "Two", "Two", "Two", "Two", "Three")`? Is it `List((One,Two,4), (One,Three,1), (Two,Three,4), (Three,Two,4), (Two,One,4), (Three,One,1))`? – Daniel Hinojosa Oct 22 '18 at 22:16
  • @jeet.. I have the answer for the other question that you posted today..53281994/spark-scala-variable-window-range-over-a-dataframe it was closed by community.. please raise a new question and I'll share the answer – stack0114106 Nov 13 '18 at 14:36
  • @stack0114106 hey thanks, but I found the answer to that question. [This](https://stackoverflow.com/questions/42448564/spark-sql-window-function-with-complex-condition) question was similar and I understood the proper approach to use spark sql window. – Jeet Banerjee Nov 13 '18 at 14:48

2 Answers

You can do it like this:

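// Every ordered pair (a, b) with a != b; repeated elements produce repeated pairs.
val ans: Seq[(String, String)] = for {
  a <- test
  b <- test
  if a != b
} yield (a, b)

// Group identical pairs and count how often each one occurs.
val result: Iterator[(String, String, Int)] =
  ans.groupBy(identity).mapValues(_.length).toIterator.map {
    case ((key, value), frequency) => (key, value, frequency)
  }

result.foreach(println)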

Output (the order of the lines may vary, since `groupBy` returns an unordered `Map`):

(bag,table,1)
(table,chair,2)
(chair,table,2)
(chair,bag,2)
(bag,chair,2)
(table,bag,1)
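
For reference, the same idea can be sketched as a self-contained program. This is only an illustration under a few assumptions: the object name `OrderedPairCounts` and the final `sorted` call (added to make the printed order deterministic) are not part of the answer above, and the counting step uses a plain `map` over the grouped pairs instead of `mapValues` and `toIterator`, both of which are deprecated as of Scala 2.13.

```scala
object OrderedPairCounts {
  val test: Seq[String] = Seq("table", "bag", "chair", "chair")

  // Every ordered pair of elements whose values differ.
  // Duplicate pairs are kept on purpose so they can be counted.
  val pairs: Seq[(String, String)] =
    for {
      a <- test
      b <- test
      if a != b
    } yield (a, b)

  // Group identical pairs and record the size of each group.
  val counts: Map[(String, String), Int] =
    pairs.groupBy(identity).map { case (pair, occurrences) => pair -> occurrences.size }

  // Flatten to triples; sorting is an illustrative choice for stable output.
  val result: Seq[(String, String, Int)] =
    counts.toSeq.map { case ((a, b), n) => (a, b, n) }.sorted

  def main(args: Array[String]): Unit =
    result.foreach(println)
}
```

Because `"chair"` appears twice, every pair that involves it is generated twice by the `for` comprehension, which is where the count of 2 comes from.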
Raman Mishra

If you think about it, the count for any combination (pairing) is the count of the 1st element times the count of the 2nd element.

val ws = Seq("table","bag","chair","chair")
val count = ws.groupBy(identity).mapValues(_.length)
val result = ws.distinct.combinations(2).flatMap{ case Seq(a,b) =>
    val comboCount = count(a) * count(b)
    Seq((a,b,comboCount), (b,a,comboCount))
}
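
Note that `combinations` returns an `Iterator`, so `result` above can be traversed only once. A minimal sketch of the same product-of-counts idea, wrapped in a function that materializes a `List`, might look like the following. The names `PairCountsByProduct` and `pairCounts` are hypothetical, chosen only for this illustration.

```scala
object PairCountsByProduct {
  // Count of an ordered pair (a, b) = occurrences of a * occurrences of b.
  def pairCounts(ws: Seq[String]): List[(String, String, Int)] = {
    // How often each distinct word occurs in the input.
    val count: Map[String, Int] =
      ws.groupBy(identity).map { case (w, occurrences) => w -> occurrences.size }

    // combinations(2) yields each unordered pair of distinct words once,
    // so both orderings are emitted explicitly.
    ws.distinct.combinations(2).flatMap {
      case Seq(a, b) =>
        val comboCount = count(a) * count(b)
        Seq((a, b, comboCount), (b, a, comboCount))
    }.toList
  }

  def main(args: Array[String]): Unit =
    pairCounts(Seq("table", "bag", "chair", "chair")).foreach(println)
}
```

Under this approach, the input `Seq("One", "One", "Two", "Two")` raised in the comments gives `List((One,Two,4), (Two,One,4))`, since each word occurs twice and 2 * 2 = 4. Whether that is the result the asker wants is not confirmed in the thread.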
jwvh