I am querying a server for some data and getting the response back as an HttpEntity.Chunked. The response body looks like this, with up to 10,000,000 lines:
[{"name":"param1","value":122343,"time":45435345},
{"name":"param2","value":243,"time":4325435},
......]
Now I want to get the incoming data into an Array[String] where each String is one line of the response, because later on it should be imported into an Apache Spark DataFrame. Currently I am doing it like this:
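import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.Http.OutgoingConnection
import akka.http.scaladsl.client.RequestBuilding
import akka.http.scaladsl.model._
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Future

// For the HTTP request
trait StartHttpRequest {
  implicit val system: ActorSystem
  implicit val materializer: ActorMaterializer

  // Sends `data` as a JSON POST to host:targetPort/path over a single outgoing connection
  def httpRequest(data: String, path: String, targetPort: Int, host: String): Future[HttpResponse] = {
    val connectionFlow: Flow[HttpRequest, HttpResponse, Future[OutgoingConnection]] =
      Http().outgoingConnection(host, port = targetPort)

    val responseFuture: Future[HttpResponse] =
      Source.single(RequestBuilding.Post(uri = path, entity = HttpEntity(ContentTypes.`application/json`, data)))
        .via(connectionFlow)
        .runWith(Sink.head)
    responseFuture
  }
}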
// Result of the request
val responseFuture: Future[HttpResponse] = httpRequest(.....)

// Convert the entity to a String; any status other than OK fails the future
val bodyFuture: Future[String] = responseFuture.flatMap { response =>
  response.status match {
    case StatusCodes.OK =>
      Unmarshal(response.entity).to[String]
    case other =>
      Future.failed(new RuntimeException(s"Unexpected status code: $other"))
  }
}

// And then something like this, but with even more stupid stuff
bodyFuture.foreach { str =>
  masterActor ! str.split("""\},\{""")
}
My question is: what would be a better way to get the result into an array? How can I unmarshal the response entity directly? For example, .to[Array[String]] did not work. And because so many lines are coming in, could I do it with a stream, to be more efficient?
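To make the streaming idea concrete, here is a minimal sketch of what I imagine, assuming Akka Streams' JsonFraming.objectScanner is a fit for this. It is not tested against the real server: the object name JsonFramingSketch, the helper frameObjects, the sample chunks, and the 1024-byte limit are all made up for illustration, and the in-memory Source only stands in for response.entity.dataBytes.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{JsonFraming, Sink, Source}
import akka.util.ByteString

import scala.concurrent.Await
import scala.concurrent.duration._

object JsonFramingSketch {

  // Hypothetical stand-in for response.entity.dataBytes: the JSON array arrives
  // in arbitrary chunks, here deliberately split in the middle of an object.
  val sampleChunks: List[ByteString] = List(
    ByteString("""[{"name":"param1","value":122343,"ti"""),
    ByteString("""me":45435345},"""),
    ByteString("""{"name":"param2","value":243,"time":4325435}]""")
  )

  // Re-frames the byte chunks into one String per top-level JSON object.
  // maxObjectLength is an assumed upper bound (in bytes) for a single object.
  def frameObjects(chunks: List[ByteString], maxObjectLength: Int): Array[String] = {
    implicit val system: ActorSystem = ActorSystem("json-framing-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      val objects = Source(chunks)
        .via(JsonFraming.objectScanner(maxObjectLength)) // emits one ByteString per object
        .map(_.utf8String)
        .runWith(Sink.seq)
      // Blocking here only keeps the sketch short; real code would stay asynchronous.
      Await.result(objects, 10.seconds).toArray
    } finally {
      system.terminate()
    }
  }

  def main(args: Array[String]): Unit =
    frameObjects(sampleChunks, maxObjectLength = 1024).foreach(println)
}
```

With the real response, I would presumably replace Source(chunks) with response.entity.dataBytes. Collecting everything with Sink.seq still holds all lines in memory, so for 10,000,000 lines it may be better to send batches (for example via grouped) to masterActor instead of building one big array.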