I am surprised that this throws an out-of-memory error, considering that the operations are on top of a scala.collection.Iterator and the individual lines are small (< 1 KB each):
import scala.io.Source

Source.fromFile("largefile.txt").getLines.map(_.size).max
It appears to be trying to load the entire file into memory, and I am not sure which step triggers this. That is disappointing behavior for such a basic operation. Is there a simple way around it? And is there any reason the library implementers designed it this way?
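In the meantime, one way around it that should stay in constant memory is to skip scala.io.Source and keep a running maximum over a plain java.io.BufferedReader (a sketch, assuming the problem is somewhere inside Source rather than in BufferedReader itself):

import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Read one line at a time and keep only the running maximum,
// so nothing beyond the current line is retained in memory.
val reader = Files.newBufferedReader(Paths.get("largefile.txt"), StandardCharsets.UTF_8)
try {
  var maxLen = 0
  var line = reader.readLine()
  while (line != null) {
    maxLen = math.max(maxLen, line.length)
    line = reader.readLine()
  }
  println(maxLen)
} finally {
  reader.close()
}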
I tried the same thing in Java 8:
import java.nio.file.*;

Files.lines(Paths.get("largefile.txt")).map(it -> it.length()).max(Integer::compare).get()
//result: 3131
And this works predictably: Files.lines returns a java.util.stream.Stream, and the heap does not explode.
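Since the java.util.stream pipeline behaves, the same call can also be driven from Scala (a sketch, assuming Scala 2.12+ so the function literal converts to java.util.function.ToIntFunction):

import java.nio.file.{Files, Paths}

// Files.lines returns a lazily populated java.util.stream.Stream[String];
// mapToInt avoids boxing and IntStream.max() yields an OptionalInt.
val lines = Files.lines(Paths.get("largefile.txt"))
try {
  println(lines.mapToInt(_.length).max().getAsInt)
} finally {
  lines.close() // the stream holds an open file handle until closed
}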
Update: it looks like this boils down to newline interpretation. In both cases the file is being read as UTF-8, and down the line both end up calling java.io.BufferedReader.readLine(), so I still need to figure out where the discrepancy is. I compiled both snippets' Main classes into the same project jar.
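To narrow down where the line splitting diverges, one check I plan to run is to push the same file through both paths with UTF-8 forced on each side and compare the line count and longest line they report (a sketch; viaSource and viaReader are just names I made up):

import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import scala.io.{Codec, Source}

// Count lines and track the longest one through scala.io.Source.getLines.
def viaSource(path: String): (Long, Int) = {
  val src = Source.fromFile(path)(Codec.UTF8)
  try {
    src.getLines().foldLeft((0L, 0)) { case ((count, longest), line) =>
      (count + 1, math.max(longest, line.length))
    }
  } finally src.close()
}

// Do the same through java.io.BufferedReader.readLine().
def viaReader(path: String): (Long, Int) = {
  val reader = Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)
  try {
    var count = 0L
    var longest = 0
    var line = reader.readLine()
    while (line != null) {
      count += 1
      longest = math.max(longest, line.length)
      line = reader.readLine()
    }
    (count, longest)
  } finally reader.close()
}

println(viaSource("largefile.txt"))
println(viaReader("largefile.txt"))

If the Source side still blows up before it can print anything, that by itself would point at Source's own line handling rather than at readLine.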