5

I'm using akka-http to make a request to an HTTP service which sends back a chunked response. This is what the relevant bit of code looks like:

val httpRequest: HttpRequest = //build the request
val request = Http().singleRequest(httpRequest)
request.flatMap { response =>
    response.entity.dataBytes.runForeach { chunk =>
        println("-----")
        println(chunk.utf8String)
    }
}

and the output produced in the command line looks something like this:

-----
{"data":
-----
"some text"}

-----
{"data":
-----
"this is a longer
-----
text"}

-----
{"data": "txt"}

-----
...

The logical piece of data - a JSON document in this case - ends with the end-of-line sequence \r\n, but the problem is that the JSON doesn't always fit in a single HTTP response chunk, as is clearly visible in the example above.

My question is - how do I concatenate the incoming chunked data into full JSON documents so that the resulting container type would still remain either Source[Out, M1] or Flow[In, Out, M2]? I'd like to follow the ideology of akka-stream.

UPDATE: It's also worth mentioning that the response is endless and the aggregation must be done in real time.

Caballero

3 Answers

4

Found a solution:

val httpRequest: HttpRequest = //build the request
val request = Http().singleRequest(httpRequest)
request.flatMap { response =>
    response.entity.dataBytes
        .scan("")((acc, curr) => if (acc.contains("\r\n")) curr.utf8String else acc + curr.utf8String)
        .filter(_.contains("\r\n"))
        .runForeach { json =>
            println("-----")
            println(json)
        }
}
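The scan stage here works like a fold that emits every intermediate accumulator value. Its semantics can be illustrated with plain Scala's scanLeft on a collection, no Akka required (a sketch; the chunk strings are made up to mirror the output shown in the question):

```scala
// Simulated chunks of an endless response, already utf8-decoded
val chunks = List("{\"data\":", " \"some text\"}\r\n", "{\"data\": \"txt\"}\r\n")

// Same accumulator logic as the scan above: once the previous value
// holds a complete line, start accumulating a fresh one
val jsons = chunks
  .scanLeft("")((acc, curr) => if (acc.contains("\r\n")) curr else acc + curr)
  .filter(_.contains("\r\n"))

jsons.foreach(println)
```

Note that each emitted element still carries its trailing \r\n, and that a chunk holding the end of one JSON plus the start of the next would be mis-handled by this accumulator.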
Caballero
  • What exactly does the function scan do? There is no documentation about it. Can you please explain? – MaatDeamon Nov 11 '15 at 16:03
  • @MaatDeamon Actually there is: "Similar to fold but is not a terminal operation, emits its current value which starts at zero and then applies the current and next value to the given function f, emitting the next current value." (http://doc.akka.io/api/akka-stream-and-http-experimental/1.0/index.html#akka.stream.scaladsl.Source). The way I understand it, it's like a fold, but one that can be applied to a continuous stream. Without it this solution would never work. – Caballero Nov 11 '15 at 16:10
  • Also, is the chunking of the response handled automatically? What I mean is, is your callback being called for each chunk? – MaatDeamon Nov 11 '15 at 16:10
0

The akka stream documentation has an entry in the cookbook for this very problem: "Parsing lines from a stream of ByteStrings". Their solution is quite verbose, but it can also handle the situation where a single chunk contains multiple lines. This seems more robust, since the chunk size could change to be big enough to hold multiple JSON messages.
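The cookbook's parser keeps the unterminated tail between chunks and can emit several complete lines from one chunk. Stripped of the Akka stages, the core bookkeeping looks roughly like this (an illustrative plain-Scala sketch, not the cookbook code itself):

```scala
// Carries the unterminated tail of the previous chunk and emits
// zero or more complete lines per incoming chunk
final class LineParser(delimiter: String) {
  private var buffer = ""
  def onChunk(chunk: String): List[String] = {
    buffer += chunk
    val parts = buffer.split(delimiter, -1) // -1 keeps a trailing empty/partial frame
    buffer = parts.last                     // unterminated remainder, kept for the next chunk
    parts.init.toList                       // complete lines
  }
}

val parser = new LineParser("\r\n")
println(parser.onChunk("{\"a\": 1}\r\n{\"b\": 2}\r\n{\"c\":")) // two lines from one chunk
println(parser.onChunk(" 3}\r\n"))                             // the third completes here
```

Unlike the scan-based answer above, this never loses data when a chunk straddles two JSON documents.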

Ramón J Romero y Vigil
  • Link updated to Akka 2.4: http://doc.akka.io/docs/akka/2.4.2/scala/stream/stream-cookbook.html#Parsing_lines_from_a_stream_of_ByteStrings – akauppi May 24 '16 at 11:53
  • akka-http will also soon contain specific framing support for JSON. An early preview of this support can be seen in @ktoso's example projects. Here is a direct link to the relevant JSON framing code: https://github.com/ktoso/scaladays-berlin-akka-streams/blob/master/src/main/scala/akka/http/scaladsl/server/JsonEntityStreaming.scala – Age Mooij Jun 22 '16 at 13:27
0
import akka.http.scaladsl.model.StatusCodes.OK
import akka.http.scaladsl.unmarshalling.Unmarshal
import akka.stream.scaladsl.Framing
import akka.util.ByteString
import scala.concurrent.Future

response.entity.dataBytes
  .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 8096))
  .mapAsyncUnordered(Runtime.getRuntime.availableProcessors()) { data =>
    if (response.status == OK) {
      val event: Future[Event] = Unmarshal(data).to[Event]
      event.foreach(x => log.debug("Received event: {}.", x))
      event.map(Right(_))
    } else {
      Future.successful(Left(data.utf8String))
    }
  }

The only requirement is that you know the maximum size of one record. If you start with something small, the default behavior is to fail if a record is larger than the limit. You can set it to truncate instead of failing, but a piece of a JSON document makes no sense.
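The fail-on-oversize behaviour can be modelled without Akka: split the buffer on the delimiter and reject any frame that has outgrown the limit (a simplified plain-Scala sketch of the behaviour described above, not Akka's implementation):

```scala
// Returns the complete frames in the buffer; fails if any frame
// exceeds maxFrameLength (mirroring the default fail-on-oversize
// behaviour; truncation would drop the oversized frame instead)
def frames(buffer: String, delimiter: String, maxFrameLength: Int): List[String] = {
  val parts = buffer.split(delimiter, -1) // -1 keeps the trailing partial frame
  require(parts.forall(_.length <= maxFrameLength), "frame exceeds maximumFrameLength")
  parts.init.toList                       // the last element is still unterminated
}

println(frames("{\"a\": 1}\n{\"b\": 2}\npartial", "\n", maxFrameLength = 8096))
```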

Abhijit Sarkar