I'm getting "Net::HTTPOK#read_body called twice (IOError)" when using the net/http library. I'm trying to download files in chunks while reusing HTTP sessions efficiently, and I'm looking for help fixing this. When I log the response in my debug output, it shows readbody=true. Is that why read_body is reported as called twice when I try to write a large file in chunks?

D, [2015-04-12T21:17:46.954928 #24741] DEBUG -- : #<Net::HTTPOK 200 OK readbody=true>
I, [2015-04-12T21:17:46.955060 #24741]  INFO -- : file found at http://hidden:8080/job/project/1/maven-repository/repository/org/project/service/1/service-1.zip.md5
/usr/lib/ruby/2.2.0/net/http/response.rb:195:in `read_body': Net::HTTPOK#read_body called twice (IOError)
    from ./deploy_application.rb:36:in `block in get_file'
    from ./deploy_application.rb:35:in `open'
    from ./deploy_application.rb:35:in `get_file'
    from ./deploy_application.rb:59:in `block in <main>'
    from ./deploy_application.rb:58:in `each'
    from ./deploy_application.rb:58:in `<main>'

The script (deploy_application.rb):

require 'net/http'
require 'logger'

STAMP = Time.now.utc.to_i

@log = Logger.new(STDOUT)

# project , build, service remove variables above
project = "project"
build = "1"
service = "service"
version = "1"
BASE_URI = URI("http://hidden:8080/job/#{project}/#{build}/maven-repository/repository/org/#{service}/#{version}/")

# file pattern for application is zip / jar. Hopefully the lib in the zipfile is acceptable.
# example for module download /#{service}/#{version}.zip /#{service}/#{version}.zip.md5 /#{service}/#{version}.jar /#{service}/#{version}.jar.md5

def clean_exit(code)
  # remove temp files on exit
end

def get_file(file)
  puts BASE_URI 
  uri = URI.join(BASE_URI,file)
  @log.debug(uri)

  request = Net::HTTP::Get.new uri #.request_uri
  @log.debug(request)

  response = @http.request request
  @log.debug(response)

  case response
    when Net::HTTPOK
      size = 0
      progress = 0
      total = response.header["Content-Length"].to_i
      @log.info("file found at #{uri}")

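      # need to handle file open error
      # create the temp directory only once; Dir.mkdir raises Errno::EEXIST on later calls
      Dir.mkdir "/tmp/#{STAMP}" unless Dir.exist? "/tmp/#{STAMP}"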
      File.open "/tmp/#{STAMP}/#{file}", 'wb' do |io|
        response.read_body do |chunk|
          size += chunk.size
          new_progress = (size * 100) / total

          unless new_progress == progress
             @log.info("\rDownloading %s (%3d%%) " % [file, new_progress])
          end

          progress = new_progress
          io.write chunk
        end
      end

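    when Net::HTTPNotFound # `when 404` never matches, because case compares against the response object
      @log.error("maven repository file #{uri} not found")
      exit 4

    when Net::HTTPServerError # any 5xx response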
      @log.error("error getting #{uri}, server returned #{response.code}")
      exit 5

    else
      @log.error("unknown http response code #{response.code}")
  end
end

@http = Net::HTTP.new(BASE_URI.host, BASE_URI.port)
files = [ "#{service}-#{version}.zip.md5", "#{service}-#{version}.jar", "#{service}-#{version}.jar.md5" ].each do |file| #"#{service}-#{version}.zip",
  get_file(file)
end
Jordan Running

1 Answer

Edit: Revised answer!

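Net::HTTP#request, when called without a block, will pre-emptively read the body. That is why your logged response already shows readbody=true, and why the later read_body call with a block raises IOError. The documentation isn't clear about this, but it hints at it by suggesting that the body is not read if a block is passed.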

If you want to make the request without reading the body, you'll need to pass a block to the request call, and then read the body from within that. That is, you want something like this:

@http.request request do |response|
  # ...
  response.read_body do |chunk|
    # ...
  end
end

The implementation makes this clear: Net::HTTPResponse#reading_body first yields the still-unread response to the block, if one was given (it is called from #transport_request, which is called from #request), and then reads the body unconditionally. The block parameter to #request is your one chance to intercept the response before the body is read.
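As a rough, self-contained sketch of the block form applied to a chunked download (the helper name `stream_to_file` and its arguments are made up for illustration and are not part of net/http or of your script), something along these lines should work:

```ruby
require 'net/http'

# Hypothetical helper (not part of net/http): streams the body of +uri+
# into the file at +path+, one chunk at a time.
def stream_to_file(http, uri, path)
  request = Net::HTTP::Get.new(uri)

  # Passing a block to #request yields the response before its body has
  # been read, so read_body with a block is still allowed in here.
  http.request(request) do |response|
    # Only stream successful responses; other cases are left to the caller.
    unless response.is_a?(Net::HTTPSuccess)
      raise "unexpected response #{response.code} for #{uri}"
    end

    File.open(path, 'wb') do |io|
      response.read_body { |chunk| io.write(chunk) }
    end
  end
end
```

All handling of the response (status check, progress logging, writing) has to happen inside the block; once `request` returns, the body has already been consumed. If you call the helper several times inside a single `@http.start do ... end` block, the requests should share one connection, which is probably what you want for session reuse.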

Chris Heald
  • You should mention that you're the author of a piece of code when you recommend it. – Jazzepi Apr 13 '15 at 05:22
  • I thought I already tried that, and I did. Even after commenting the response and request logging, I still get the same error. #@log.debug(response) /usr/lib/ruby/2.2.0/net/http/response.rb:195:in `read_body': Net::HTTPOK#read_body called twice (IOError) from ./deploy_hadoop_application.rb:36:in `block in get_file' from ./deploy_hadoop_application.rb:35:in `open' from ./deploy_hadoop_application.rb:35:in `get_file' from ./deploy_hadoop_application.rb:59:in `block in <main>' from ./deploy_hadoop_application.rb:58:in `each' from ./deploy_hadoop_application.rb:58:in `<main>' – Colin Williams Apr 13 '15 at 05:34
  • I don't see anything else that could cause that; `read_body` and `body` are the only methods that will set `@read = true`, and there's no obvious reuse of the response happening there. Have you tried using pry to step through the code to see if it's behaving contrary to expectations? Is that the complete log before it terminates? – Chris Heald Apr 13 '15 at 05:49
  • I ran into this and the problem was that while I was using `http.request(req) do |response|`, in my own method I was then returning the response rather than yielding it or applying a block to `read_body`, which meant that the `yield` that was supposed to yield the response without invoking `body` never took effect. It would be nice if this could be managed a little more explicitly. – David Moles May 29 '15 at 20:02