I am trying to stream a zip file.

The following chunk of code prints line by line as expected:

val inputStream = new GZIPInputStream(new FileInputStream("/some/path"))
val source = Source.fromInputStream(inputStream)
for(line <- source.getLines) {
  println(line)
}

But this one doesn't do anything (it doesn't even exit):

val inputStream = new ZipInputStream(new FileInputStream("/some/path"))
val source = Source.fromInputStream(inputStream)
for(line <- source.getLines) {
  println(line)
}

The only difference is the use of ZipInputStream instead of GZIPInputStream. Both classes extend InputStream.

Am I missing something? Or is there a workaround?

Alexandre Annic

1 Answer

Gzip is just a single compressed file that can be decompressed on the fly as you read from Source. A ZipInputStream isn't really a stream in that sense; the name is one of many Java misnomers (take a look at the interface). A zip is more like a directory containing several files: you traverse it via ZipEntry and read each file separately via Source. There is no real content at the top level, just a directory listing, so there are no "lines" to fetch via Source until you move to an entry.
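
To make the "directory, not stream" point concrete, here is a minimal sketch. It assumes a small archive built in memory; `ZipAsDirectorySketch`, `sampleZip`, `linesBeforeFirstEntry` and `entryNames` are hypothetical names used only for illustration. Reading before any entry is selected yields no lines, while walking the entries gives the listing.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}
import scala.io.Source

object ZipAsDirectorySketch {
  // Hypothetical helper: builds a small zip archive in memory for the demo.
  def sampleZip(files: Seq[(String, String)]): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ZipOutputStream(bytes)
    files.foreach { case (name, content) =>
      out.putNextEntry(new ZipEntry(name))
      out.write(content.getBytes("UTF-8"))
      out.closeEntry()
    }
    out.close()
    bytes.toByteArray
  }

  // Lines readable before getNextEntry has been called (no entry selected yet).
  def linesBeforeFirstEntry(archive: Array[Byte]): List[String] = {
    val zip = new ZipInputStream(new ByteArrayInputStream(archive))
    try Source.fromInputStream(zip, "UTF-8").getLines().toList
    finally zip.close()
  }

  // Entry names in archive order; getNextEntry returns null after the last one.
  def entryNames(archive: Array[Byte]): List[String] = {
    val zip = new ZipInputStream(new ByteArrayInputStream(archive))
    try Iterator.continually(zip.getNextEntry).takeWhile(_ != null).map(_.getName).toList
    finally zip.close()
  }
}
```

With an archive holding `a.txt` and `b.txt`, `linesBeforeFirstEntry` should come back empty and `entryNames` should list both files.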

In a nutshell, you just iterate through entries, creating a new Source for each one. Something like this:

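    import java.io.FileInputStream
    import java.util.zip.ZipInputStream
    import scala.io.Source

    val zip = new ZipInputStream(new FileInputStream("/some/path"))
    // getNextEntry moves to the next file in the archive; null means no entries remain
    val contents = Iterator
      .continually(zip.getNextEntry)
      .takeWhile(_ != null)
      .map { e =>
        e.getName -> Source.fromInputStream(zip).getLines().toList
      }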

(This yields pairs of each file's name and its entire content held in memory; call .toMap on the result if you want a Map. It is probably not what you want at all, just an illustration of how to access that content via Source.)
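
If you would rather not hold whole files in memory, the same traversal can hand each line to a callback as it is read. This is only a sketch under assumptions: `ZipLineStreamingSketch` and `foreachLine` are hypothetical names, and the content is assumed to be UTF-8 text.

```scala
import java.io.InputStream
import java.util.zip.ZipInputStream
import scala.io.Source

object ZipLineStreamingSketch {
  // Calls handle(entryName, line) for every line of every entry,
  // without collecting the lines into a list first.
  def foreachLine(in: InputStream)(handle: (String, String) => Unit): Unit = {
    val zip = new ZipInputStream(in)
    try {
      Iterator
        .continually(zip.getNextEntry)
        .takeWhile(_ != null)
        .foreach { entry =>
          // Reads return end-of-stream at the end of the current entry,
          // so this Source covers exactly one file of the archive.
          Source
            .fromInputStream(zip, "UTF-8")
            .getLines()
            .foreach(line => handle(entry.getName, line))
        }
    } finally zip.close()
  }
}
```

Any InputStream works as the input, so in principle the stream returned by `openStream()` on a `java.net.URL` could be passed in directly, e.g. `ZipLineStreamingSketch.foreachLine(stream)((name, line) => println(s"$name: $line"))`, where `stream` stands for whatever source you have.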

Dima
  • Yeah, I just realized that. What I'm trying to do is unzip and read a really large file from a URL. I don't want to store it. If you know a way to do it, you would make my day. – Alexandre Annic May 03 '21 at 16:56
  • Well, in a nutshell, you just iterate through entries, creating a new `Source` for each one. Something like this: `Iterator.continually(zip.getNextEntry).takeWhile(_ != null).map { e => e.getName -> Source.fromInputStream(zip).getLines.toList }` – Dima May 03 '21 at 17:28