I'm trying to consume a bunch of files from S3 in a streaming manner using akka streams:
S3.listBucket("<bucket>", Some("<common_prefix>"))
.flatMapConcat { r => S3.download("<bucket>", r.key) }
.mapConcat(_.toList)
.flatMapConcat(_._1)
.via(Compression.gunzip())
.via(Framing.delimiter(ByteString("\n"), Int.MaxValue))
.map(_.utf8String)
.runForeach { x => println(x) }
Without increasing akka.http.host-connection-pool.response-entity-subscription-timeout
I get
java.util.concurrent.TimeoutException: Response entity was not subscribed after 1 second. Make sure to read the response entity body or call discardBytes() on it.
for the second file, just after printing the last line of the first file, when trying to access the first line of the second file.
I understand the nature of that exception. I don't understand why the request to the second file is already in progress, while the first file is still being processed. I guess, there's some buffering involved.
Any ideas how to get rid of that exception without having to increase akka.http.host-connection-pool.response-entity-subscription-timeout
?