I'm writing a small exercise app that calculates number of unique letters (incl Unicode) in a seq
of strings
, and I'm using aggregate
for it, as I try to run in parallel
here's my code:
class Frequency(seq: Seq[String]) {
type FreqMap = Map[Char, Int]
def calculate() = {
val freqMap: FreqMap = Map[Char, Int]()
val pattern = "(\\p{L}+)".r
val seqop: (FreqMap, String) => FreqMap = (fm, s) => {
s.toLowerCase().foldLeft(freqMap){(fm, c) =>
c match {
case pattern(char) => fm.get(char) match {
case None => fm+((char, 1))
case Some(i) => fm.updated(char, i+1)
}
case _ => fm
}
}
}
val reduce: (FreqMap, FreqMap) => FreqMap =
(m1, m2) => {
m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }
}
seq.par.aggregate(freqMap)(seqop, reduce)
}
}
and then the code that makes use of that
object Frequency extends App {
val text = List("abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc");
def frequency(seq: Seq[String]):Map[Char, Int] = {
new Frequency(seq).calculate()
}
Console println frequency(seq=text)
}
though I supplied "abc" 9 times, the result is Map(a -> 8, b -> 8, c -> 8)
, as it is for any number of "abc"
's > 8
I've looked at this, and it seems like I'm using aggregate correctly
Any suggestions to make it work?