I'm using Dispatch to pull down a lot of pages from which I only need the first few kilobytes, and the pages are sometimes gigabytes in size. Is there any way in scala-dispatch (dispatch/reboot), or maybe in the HTTP request itself, to truncate the body received?

(Context: I'm reading CSV files from public data sources, and am just trying to get the field names (header row) and one row of sample data.)

Ed Staub

1 Answer

You can, using the > handler, which gives you access to the underlying com.ning.http.client.Response instance. From there, it's simple:

import java.io._
import dispatch._, Defaults._
import com.ning.http.client.Response

// Return at most `bytes` bytes of the body, decoded as UTF-8.
def excerpt(bytes: Int) = {
  response: Response =>
    response.getResponseBodyExcerpt(bytes, "UTF-8")
}

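// Return the first `count` lines of the body (fewer if the body is shorter).
def lines(count: Int) = {
  response: Response =>
    val stream = response.getResponseBodyAsStream
    val reader = new BufferedReader(new InputStreamReader(stream))
    // readLine() returns null at end of stream, so stop there.
    Stream.continually(reader.readLine()).takeWhile(_ != null).take(count).toList
}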

val u = url("http://stackoverflow.com/")
Http(u > excerpt(100)).onComplete(println)
Http(u > lines(2)).onComplete(println)

You might also try requesting a smaller byte interval from the server using the Range header. This requires server support, which you can test by sending a HEAD request and checking for an Accept-Ranges: bytes header in the response.
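
As a rough sketch of the Range idea, here is a version that uses only the JDK's HttpURLConnection rather than Dispatch, so it doesn't depend on any particular Dispatch version. The names RangeFetch, rangeHeader, firstLines and head are made up for this example, and the UTF-8 decoding is an assumption about the data:

```scala
import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.net.{HttpURLConnection, URI}
import java.nio.charset.StandardCharsets

// Hypothetical helper object, not part of Dispatch or any library.
object RangeFetch {
  // Range header value for bytes first..last (both ends inclusive).
  def rangeHeader(first: Long, last: Long): String = {
    require(first >= 0 && last >= first, "need 0 <= first <= last")
    s"bytes=$first-$last"
  }

  // Read up to `count` lines, stopping early at end of stream.
  // Assumes the content is UTF-8.
  def firstLines(in: InputStream, count: Int): List[String] = {
    val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
    Iterator.continually(reader.readLine()).takeWhile(_ != null).take(count).toList
  }

  // Ask the server for only the first `maxBytes` bytes, then keep `count` lines.
  def head(address: String, maxBytes: Long, count: Int): List[String] = {
    val conn = URI.create(address).toURL.openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestProperty("Range", rangeHeader(0, maxBytes - 1))
    try {
      // A 206 status means the range was honored; a 200 means the server
      // ignored it and is sending the full body.
      val in = conn.getInputStream
      try firstLines(in, count) finally in.close()
    } finally conn.disconnect()
  }
}
```

Something like RangeFetch.head(someCsvUrl, 4096, 2) would then return the header row plus one data row, provided both fit in the first 4 KB. If the byte range ends in the middle of a line, the last line returned will be cut short, so ask for a few more bytes than you expect to need.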

Ionuț G. Stan