0

The example in the README is very elegant:

scala> Map(1 -> Max(2)) + Map(1 -> Max(3)) + Map(2 -> Max(4))
res0: Map[Int,Max[Int]] = Map(2 -> Max(4), 1 -> Max(3))

Essentially the use of Map here is equivalent to SQL's group by.

But how do I do the same with an arbitrary Aggregator? For example, to achieve the same thing as the code above (but without the Max wrapper class):

scala> import com.twitter.algebird._
scala> val mx = Aggregator.max[Int]
mx: Aggregator[Int,Int,Int] = MaxAggregator(scala.math.Ordering$Int$@78c77)
scala> val mxOfMap = // what goes here?
mxOfMap: Aggregator[Map[Int,Int],Map[Int,Int],Map[Int,Int]] = ...
scala> mxOfMap.reduce(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res0: Map[Int,Int] = Map(2 -> 4, 1 -> 3)

In other words, how to I convert (or "lift") an Aggregator that operates on values of type T into an Aggregator that operates on values of type Map[K,T] (for some arbitrary K)?

Todd Owen
  • 15,650
  • 7
  • 54
  • 52

1 Answers1

0

Looks like this can be done fairly easily for Semigroup at least. This should be sufficient in the case where there is no additional logic in the "compose" or "present" phases of the aggregator which needs to be preserved (a Semigroup can be obtained from an Aggregator, discarding compose/prepare).

The code to answer the original question is:

scala> val sgOfMap = Semigroup.mapSemigroup[Int,Int](mx.semigroup)
scala> val mxOfMap = Aggregator.fromSemigroup(sgOfMap)
scala> mxOfMap.reduce(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res0: Map[Int,Int] = Map(2 -> 4, 1 -> 3)

But in practice, it would be better to start by constructing the arbitrary Semigroup directly, rather than constructing an Aggregator merely to extract the semigroup from:

scala> import com.twitter.algebird._
scala> val mx = Semigroup.from { (x: Int, y: Int) => Math.max(x, y) }
scala> val mxOfMap = Semigroup.mapSemigroup[Int,Int](mx)
scala> mxOfMap.sumOption(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res33: Option[Map[Int,Int]] = Some(Map(2 -> 4, 1 -> 3))

Alternatively, convert to aggregator: Aggregator.fromSemigroup(mxOfMap)

Todd Owen
  • 15,650
  • 7
  • 54
  • 52