2

I am trying to write a Ruby script that gets some details about files on a website using net/http. My code looks like this:

require 'net/http'
require 'uri'

# asset holds the URL of the file to inspect, as a string
url = URI.parse asset
res = Net::HTTP.start(url.host, url.port) {|http|
  http.get(url.request_uri)
}

headers = res.to_hash
p headers

I would like to get two pieces of information from this request: the total length of the content inflated, and (as appropriate) the length of the content deflated.

Sometimes the response will include a content-length header, which appears to be the gzipped length of the content. I can also approximate the inflated size of the content using res.body.length, but this has not been foolproof by any stretch of the imagination. The documentation on net/http says that the gzip-related headers are removed from the list automatically (to help me, gee thanks), so I cannot seem to get a reliable handle on this information.

Any help is appreciated (including other gems if they will do this more easily).

Joe Mastey

2 Answers

3

Got it! The "magic" behavior here only occurs if you don't specify your own accept-encoding header. Amended code as follows:

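require 'net/http'
require 'uri'
require 'stringio'
require 'zlib'

# Sending our own accept-encoding header stops net/http from transparently
# inflating the body and dropping the related headers.
request_headers = { "accept-encoding" => "gzip;q=1.0,deflate;q=0.6,identity;q=0.3" }
url = URI.parse asset
res = Net::HTTP.start(url.host, url.port) {|http|
  http.get(url.request_uri, request_headers)
}

headers = res.to_hash

# Only gzip is inflated here; a deflate-encoded body is left as received.
gzipped = headers['content-encoding'] && headers['content-encoding'][0] == "gzip"
content = gzipped ? Zlib::GzipReader.new(StringIO.new(res.body)).read : res.body

# bytesize counts bytes rather than characters
full_length = content.bytesize
compressed_length = (headers["content-length"] && headers["content-length"][0].to_i || res.body.bytesize)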
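
For what it's worth, the size calculation can be pulled out into a small helper so it can be checked without touching the network. This is only a sketch: content_sizes is a hypothetical name, not part of net/http, and it assumes the headers are in the shape returned by res.to_hash (lowercase keys, array values) and that the body is the raw, still-compressed bytes.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (name and shape are assumptions, not a library API).
# raw_body: the body exactly as received, i.e. not yet inflated.
# headers:  a hash like the one returned by Net::HTTPResponse#to_hash.
def content_sizes(raw_body, headers)
  encoding = headers['content-encoding']&.first
  # Only gzip is handled; any other encoding is measured as received.
  inflated = if encoding == 'gzip'
               Zlib::GzipReader.new(StringIO.new(raw_body)).read
             else
               raw_body
             end
  declared = headers['content-length']&.first
  {
    full_length: inflated.bytesize,
    # Prefer the server's declared length, fall back to what we received.
    compressed_length: declared ? declared.to_i : raw_body.bytesize
  }
end
```

With the request above, it would be called as content_sizes(res.body, res.to_hash).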
Joe Mastey
0

You can try using sockets to send a HEAD request to the server, which is faster (no content is transferred), and leave out "Accept-Encoding: gzip" so that the response will not be gzipped.
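
A rough sketch of that idea, using Net::HTTP::Head instead of a raw socket (that swap is mine, as are the helper names build_head_request and head_content_length). It assumes the server reports a content-length for HEAD requests at all, which is not guaranteed, and that asking for "identity" is enough to get the uncompressed length.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a HEAD request that asks for an unencoded response.
def build_head_request(uri)
  req = Net::HTTP::Head.new(uri.request_uri)
  req['Accept-Encoding'] = 'identity'
  req
end

# Hypothetical helper: returns the declared content-length as an Integer,
# or nil when the server does not send one.
def head_content_length(asset_url)
  uri = URI.parse(asset_url)
  res = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_head_request(uri))
  end
  res['content-length']&.to_i
end
```

Calling head_content_length(asset) would then give the uncompressed size if the server cooperates; the compressed size would still need a second request that does accept gzip.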

jcubic