
I'm trying to implement a scalaz-stream channel that accumulates statistics about the events it receives and, once complete, emits the final statistics.

To give a concrete, simplified example: imagine that you have a Process[Task, String] where each string is a word. I'd like to have a Channel[Task, String, (String, Int)] that, when applied to that initial process, would drain it, count the number of times each word occurs, and emit that.

I realise this is trivial through a fold:

input.foldMap(w => Map(w -> 1))
     .flatMap(m => Process.emitAll(m.toSeq))
     .maximumBy(_._2)
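
For reference, here is roughly the same computation on a plain List, with no scalaz-stream involved. This is only a sketch: the List stands in for the Process[Task, String], and the object name and sample words are made up for illustration.

```scala
object WordCountSketch {
  // Made-up sample input; a List stands in for the Process[Task, String].
  val words: List[String] = List("to", "be", "or", "not", "to", "be", "to")

  // Same shape as foldMap(w => Map(w -> 1)): accumulate one count per word.
  val counts: Map[String, Int] =
    words.foldLeft(Map.empty[String, Int]) { (acc, w) =>
      acc.updated(w, acc.getOrElse(w, 0) + 1)
    }

  // Same as emitting every (word, count) pair and keeping the highest count.
  val mostFrequent: (String, Int) = counts.maxBy(_._2)
}
```

With this input, "to" occurs three times, so mostFrequent is ("to", 3).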

What I'm trying to write is a collection of standard accumulators that I can then just pipe my processes through. Rather than folding explicitly, I'd write something like:

input.through(wordFrequency)
     .maximumBy(_._2)

I'm at a bit of a loss though - I can't work out how to do so without sharing state. Writing a Sink that accumulates into a Map[String, Int] is fairly simple, but there's no way to get the final state of the map and emit it once the sink has terminated.
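
To make the shared-state problem concrete, here is a library-free sketch. It assumes a sink can be modelled as a plain side-effecting function; WordSink, countingSink and run are hypothetical names, not scalaz-stream API.

```scala
import scala.collection.mutable

object SharedStateSketch {
  // Hypothetical stand-in for a Sink: a side-effecting function applied to each word.
  type WordSink = String => Unit

  // The sink cannot return anything, so it has to write into state owned by the caller.
  def countingSink(state: mutable.Map[String, Int]): WordSink =
    w => state.update(w, state.getOrElse(w, 0) + 1)

  def run(words: List[String]): Map[String, Int] = {
    val shared = mutable.Map.empty[String, Int]
    words.foreach(countingSink(shared))
    // The final counts are only reachable by reading the shared map afterwards;
    // the sink itself never emits them.
    shared.toMap
  }
}
```

Calling SharedStateSketch.run(List("a", "b", "a")) gives Map("a" -> 2, "b" -> 1), but only because run holds on to the mutable map, which is exactly the sharing I'd like to avoid.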

Nicolas Rinaudo