3

I am fairly new to Ruby and come from a Python background. I want to make a HEAD request to a URL and check some information, such as whether the file exists on the server, its timestamp, its ETag, etc. I am not able to get this done in Ruby.

In Python:

import httplib2
print httplib2.Http().request('url.com/file.xml','HEAD')

In Ruby, I tried this, but it throws an error:

require 'net/http'

Net::HTTP.start('url.com'){|http|
   response = http.head('/file.xml')
}
puts response


SocketError: getaddrinfo: nodename nor servname provided, or not known
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `initialize'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `open'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `block in connect'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/timeout.rb:51:in `timeout'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:876:in `connect'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:861:in `do_start'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:850:in `start'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:582:in `start'
    from (irb):2
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/bin/irb:16:in `<main>'
Krish

3 Answers

8

I realize this has been answered, but I had to jump through some hoops, too. Here's something more concrete to start with:

#!/usr/bin/env ruby

require 'net/http'
require 'net/https' # for openssl

uri = URI('http://stackoverflow.com')
path = '/questions/16325918/making-head-request-in-ruby'

response=nil
http = Net::HTTP.new(uri.host, uri.port)
# http.use_ssl = true                            # if using SSL
# http.verify_mode = OpenSSL::SSL::VERIFY_NONE   # for example, when using self-signed certs

response = http.head(path)
response.each { |key, value| puts key.ljust(40) + " : " + value }
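The commented-out SSL lines above can also be folded into a single call. The following is a minimal sketch, assuming a Ruby version where Net::HTTP.start accepts the use_ssl option; head_response is a hypothetical helper name and the URL in the usage comment is a placeholder.

```ruby
require 'net/http'
require 'uri'

# Sends a HEAD request to url and returns the Net::HTTPResponse.
# head_response is a hypothetical helper name, not part of Net::HTTP.
def head_response(url)
  uri = URI(url)
  # use_ssl turns TLS on for https:// URLs; certificate verification is
  # left at its default instead of being disabled.
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    # request_uri is the path plus any query string ("/" if the URL has no path).
    http.head(uri.request_uri)
  end
end

# Usage (placeholder URL):
#   response = head_response('https://example.com/file.xml')
#   puts response.code                        # status code as a string
#   puts response.is_a?(Net::HTTPSuccess)     # true for a 2xx answer
#   response.each_header { |key, value| puts key.ljust(40) + " : " + value }
```

Net::HTTP.start returns the value of its block, so the response is available to the caller without a variable declared beforehand.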
Mike D
6

The SocketError means the host name passed to Net::HTTP.start could not be resolved. start expects a bare host name (and optionally a port), not a full URL; in the docs, both are taken from a URI object:

uri = URI('http://example.com/some_path?query=string')

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri

  response = http.request request # Net::HTTPResponse object
end

You can try this:

require 'net/http'

url = URI('http://yoururl.com') # include the scheme; without it url.host is nil

Net::HTTP.start(url.host, url.port){|http|
   response = http.head('/file.xml')
   puts response
}

One thing I noticed - your puts response needs to be inside the block! Otherwise, the variable response is not in scope.
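The scoping rule can be seen without any network access. This is a small sketch in which with_connection is a hypothetical stand-in for Net::HTTP.start (it yields a fake connection and returns the block's value); it shows the failing case and two ways around it.

```ruby
# Hypothetical stand-in for Net::HTTP.start: it yields a fake connection and
# returns the block's value, which is also what Net::HTTP.start does.
def with_connection
  yield :fake_http
end

def scope_demo
  # 1. First assigned inside the block: the variable is local to the block.
  with_connection { |http| inside_only = "response from #{http}" }
  visible_after_block = !defined?(inside_only).nil?

  # 2. Declared before the block: the block assigns to the outer variable.
  declared_first = nil
  with_connection { |http| declared_first = "response from #{http}" }

  # 3. No outer variable needed: capture the block's return value.
  returned = with_connection { |http| "response from #{http}" }

  { visible_after_block: visible_after_block,
    declared_first: declared_first,
    returned: returned }
end

scope_demo.each { |name, value| puts "#{name}: #{value}" }
```

With the real library, the third form would read response = Net::HTTP.start(url.host, url.port) { |http| http.head('/file.xml') }.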

Edit: You can also treat the response as a hash to get the values of the headers:

response.each_value { |value| puts value }
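To get something close to the dictionary that httplib2 prints, the status code and the headers can be collected into one Hash. A rough sketch follows: response_summary is a hypothetical helper, and the response object is built by hand with made-up header values purely so the example runs offline; in real code it would come from http.head.

```ruby
require 'net/http'

# Collects the status code and every header (lowercase names) into one Hash.
# response_summary is a hypothetical helper, not part of Net::HTTP.
def response_summary(response)
  summary = { 'status' => response.code }
  response.each_header { |name, value| summary[name] = value }
  summary
end

# Hand-built stand-in for the object returned by http.head, so the sketch
# runs without a network connection. The header values are made up.
response = Net::HTTPOK.new('1.1', '200', 'OK')
response['Content-Type'] = 'application/xml'
response['ETag'] = '"5f56a-ba7"'

puts response['content-type'] # header lookup is case-insensitive
p response_summary(response)
```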
hlh
  • Thank you. Priti, the URL I am trying is internal, so you can't access it. In general, it's a URL for downloading an XML file. I don't want to download the file before I know more about it, such as whether it is stale or a duplicate, so I want a HEAD request, which fetches the properties without downloading the content. – Krish May 01 '13 at 21:05
  • I tried your second method and it worked, but I am getting only this value back: "#". I am expecting a whole lot of header information and properties about the file. – Krish May 01 '13 at 21:06
  • I am expecting information like this ({'status': '200', 'content-length': '2983', 'accept-ranges': 'bytes', 'server': 'Apache/2.2.17 (Unix)', 'last-modified': 'Wed, 01 May 2013 20:53:26 GMT', 'etag': '"5f56a-ba7-4dbae4f35555"', 'date': 'Wed, 01 May 2013 21:11:30 GMT', 'content-type': 'application/xml'}, '') – Krish May 01 '13 at 21:11
  • Yes. If you look [at the documentation](http://ruby-doc.org/stdlib-2.0/libdoc/net/http/rdoc/Net/HTTP.html#method-i-head) you'll see that the :head method returns an HTTPResponse object that embodies the response status code (here, it's 200 OK). You can print headers in a hash format, like `puts response['content-type']` – hlh May 01 '13 at 21:13
  • But when I do response.body, it has the actual content of the file. I don't want to download the content, because I have hundreds of files on the server and they are really huge, around 800 MB each, so it would use up my system memory and slow down the calls. I just need to do a HEAD request and get the properties of the file alone. – Krish May 01 '13 at 21:24
  • hlh, thank you. Your second method worked well. It did not download any content, and at the same time response[''] worked. – Krish May 01 '13 at 21:30
  • Krish, you can also treat the response like a hash to get the header values. See my edit. Happy it worked for you. – hlh May 01 '13 at 21:37
3
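require 'net/http'

headers = nil # declared outside the block so it is still in scope afterwards

url = URI('http://my-bucket.amazonaws.com/filename.mp4')

Net::HTTP.start(url.host, url.port) do |http|
  # to_hash maps lowercase header names to arrays of values
  headers = http.head(url.path).to_hash
end

Now the variable headers holds a hash of the response headers.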
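Since the goal in the question is to decide whether a file is stale before downloading it, a hash like this can be compared against values saved from an earlier request. A minimal sketch, assuming the server sends ETag or Last-Modified headers: remote_changed? and first_value are hypothetical helpers, the header values are made up, and the ETag check is a plain string comparison.

```ruby
require 'time'

# headers is a Hash in the shape Net::HTTPResponse#to_hash returns:
# lowercase header names mapped to arrays of values.
def first_value(headers, name)
  Array(headers[name]).first
end

# Hypothetical helper: true when the remote file looks different from the
# copy described by known_etag / known_mtime.
def remote_changed?(headers, known_etag: nil, known_mtime: nil)
  etag = first_value(headers, 'etag')
  return etag != known_etag if etag && known_etag

  last_modified = first_value(headers, 'last-modified')
  return Time.httpdate(last_modified) > known_mtime if last_modified && known_mtime

  true # nothing to compare against, so assume the file changed
end

# Made-up headers for illustration.
headers = {
  'etag'          => ['"5f56a-ba7"'],
  'last-modified' => ['Wed, 01 May 2013 20:53:26 GMT']
}
puts remote_changed?(headers, known_etag: '"5f56a-ba7"') # false: same ETag
puts remote_changed?(headers, known_etag: '"older"')     # true
```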

Nicolas Maloeuvre