I have an Iterator[Record]
which is ordered on record.id
this way:
record.id=1
record.id=1
...
record.id=1
record.id=2
record.id=2
..
record.id=2
Records of a specific ID could occur a large number of times, so I want to write a function that takes this iterator as input, and returns an Iterator[Iterator[Record]]
output in a lazy manner.
I was able to come up with the following, but it fails on StackOverflowError
after 500K records or so:
def groupByIter[T, B](iterO: Iterator[T])(func: T => B): Iterator[Iterator[T]] = new Iterator[Iterator[T]] {
var iter = iterO
def hasNext = iter.hasNext
def next() = {
val first = iter.next()
val firstValue = func(first)
val (i1, i2) = iter.span(el => func(el) == firstValue)
iter = i2
Iterator(first) ++ i1
}
}
What am I doing wrong?