I'm trying to implement a scalaz-stream channel that accumulates statistics about the events it receives and, once complete, emits the final statistics.
To give a concrete, simplified example: imagine that you have a Process[Task, String] where each string is a word. I'd like to have a Channel[Task, String, (String, Int)] that, when applied to that initial process, would drain it, count the number of times each word occurs, and emit the resulting (word, count) pairs.
I realise this is trivial through a fold:
input.foldMap(w => Map(w -> 1))
.flatMap(m => Process.emitAll(m.toSeq))
.maximumBy(_._2)
What I'm trying to write is a collection of standard accumulators that I can then just pipe my processes through - rather than explicitly fold, say, I'd write:
input.through(wordFrequency)
.maximumBy(_._2)
I'm at a bit of a loss though - I can't work out how to do so without sharing state. Writing a Sink that accumulates into a Map[String, Int] is fairly simple, but there's no way to get the final state of the map and emit it once the sink has terminated.
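To make the problem concrete, here is a rough sketch of the shared-state version I'd like to avoid. It assumes scalaz-stream 0.7 or 0.8 on scalaz 7.1 (sink.lift and Task#run may be spelled differently in other versions), and SharedStateSketch, countingSink and countWords are just names made up for this example:

```scala
import scalaz.concurrent.Task
import scalaz.stream.{Process, Sink, sink}

// Hypothetical example object, not part of scalaz-stream.
object SharedStateSketch {
  // Mutable state living outside the stream: this is the part I want to get rid of.
  @volatile private var counts = Map.empty[String, Int]

  // A Sink that bumps the count of every word it receives.
  // Assumption: sink.lift is available (scalaz-stream 0.7 / 0.8).
  val countingSink: Sink[Task, String] =
    sink.lift[Task, String] { w =>
      Task.delay {
        counts = counts.updated(w, counts.getOrElse(w, 0) + 1)
      }
    }

  // Drains the input through the sink, then reads the shared map "from the outside".
  def countWords(input: Process[Task, String]): Map[String, Int] = {
    counts = Map.empty
    // First run: Process => Task[Unit]. Second run: executes the Task
    // (assumption: scalaz 7.1; on 7.2 this would be unsafePerformSync).
    input.to(countingSink).run.run
    counts
  }
}
```

The counts only become available by reading the variable after the process has finished, so nothing downstream of the sink can pick them up as a stream element, which is what I'd need in order to chain maximumBy afterwards.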