
Similar to "getting the status code of an HTTP redirected page", but with Net::HTTP instead of curb. I am making a GET request to a page that will redirect:

response = Net::HTTP.get_response(URI.parse("http://www.wikipedia.org/wiki/URL_redirection"))
puts response.code
puts response['location']

=> 301 
en.wikipedia.org/wiki/URL_redirection

The problem is that I want to know the status code of the redirected page. In this case it is 200, but in my app I want to check if it is 200 or something else. The solution I've seen is to just call get_response(response['location']), but that won't work in my application because the way the redirect is designed makes it so that the redirect can only be followed once. Since the first GET consumes that one redirect, I can't then follow it again.

Is there some way to get the last status code that is a result of a GET?


EDIT: Further clarification of the situation:

The application that I'm sending GET to has a single sign-on authentication mechanism where, if I want to access 'myapp/mypage', I have to first send a post:

postResponse = Net::HTTP.post_form(URI.parse("http://myapp.com/trusted"), {"username" => @username})

Then make the GET request to:

"http://myapp.com/trusted/#{postResponse.body}/mypage"

*The postResponse.body is a 'ticket' which can be redeemed once.

That GET verifies that the ticket is valid and then redirects to:

myapp.com/mypage

So whether that ticket is valid or not, I get a 301.

I want to check the status code of the final get to myapp.com/mypage.

If I manually try to follow the redirect, whether with a HEAD request or a GET, the original request will have already consumed the ticket, so I get an error that the ticket is expired even if the original redirect ended in a 200.

Mike Kovner
  • Can you simply make an `HTTP HEAD` request to the redirected location to check its status without 'consuming' the URL? If so, `Net::HTTP` supports this and I can give you a few pointers. – struthersneil Nov 01 '13 at 19:10

2 Answers


The Net::HTTP documentation has example code showing how to deal with redirects. Have you tried it? It should make it easy to get inside the redirect mechanism and grab statuses for later.

Here's their example:

Following Redirection

Each Net::HTTPResponse object belongs to a class for its response code.

For example, all 2XX responses are instances of a Net::HTTPSuccess subclass, a 3XX response is an instance of a Net::HTTPRedirection subclass and a 200 response is an instance of the Net::HTTPOK class. For details of response classes, see the section “HTTP Response Classes” below.
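
As a rough illustration of that hierarchy (a sketch, not part of the Net::HTTP documentation; `category_for` is a made-up helper name), the response classes can be compared directly without making any request:

```ruby
require 'net/http'

# Classify a Net::HTTP response class by its parent class.
# `category_for` is a hypothetical helper used only for this illustration.
def category_for(response_class)
  if response_class <= Net::HTTPSuccess
    :success
  elsif response_class <= Net::HTTPRedirection
    :redirection
  else
    :other
  end
end

puts category_for(Net::HTTPOK)               # success
puts category_for(Net::HTTPMovedPermanently) # redirection (301)
puts category_for(Net::HTTPFound)            # redirection (302)
puts category_for(Net::HTTPNotFound)         # other
```

Because `case`/`when` uses `===`, a `when Net::HTTPRedirection` branch matches a 301 and a 302 alike.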

Using a case statement you can handle various types of responses properly:

def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'too many HTTP redirects' if limit == 0

  response = Net::HTTP.get_response(URI(uri_str))

  case response
  when Net::HTTPSuccess then
    response
  when Net::HTTPRedirection then
    location = response['location']
    warn "redirected to #{location}"
    fetch(location, limit - 1)
  else
    response.value
  end
end

print fetch('http://www.ruby-lang.org')

A minor change like this should help:

require 'net/http'

RESPONSES = []
def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'too many HTTP redirects' if limit == 0

  response = Net::HTTP.get_response(URI(uri_str))

  RESPONSES << response

  case response
  when Net::HTTPSuccess then
    response
  when Net::HTTPRedirection then
    location = response['location']
    warn "redirected to #{location}"
    fetch(location, limit - 1)
  else
    response.value
  end
end

print fetch('http://jigsaw.w3.org/HTTP/300/302.html')
puts RESPONSES.join("\n")

I see this when I run it:

redirected to http://jigsaw.w3.org/HTTP/300/Overview.html
#<Net::HTTPOK:0x007f9e82a1e050>#<Net::HTTPFound:0x007f9e82a2daa0>
#<Net::HTTPOK:0x007f9e82a1e050>
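
If only the status codes matter, a variation that returns the whole chain instead of appending to a global constant might be easier to work with. This is a sketch under assumptions: `fetch_chain` and its `getter:` parameter are names invented here (the `getter:` hook only exists so the chain can be exercised without a network), and it has not been tried against a single-use ticket URL.

```ruby
require 'net/http'
require 'uri'

# Follow redirects and return every response seen, in request order.
# `fetch_chain` and `getter:` are hypothetical names for this sketch.
def fetch_chain(uri_str, limit: 10, getter: ->(uri) { Net::HTTP.get_response(uri) })
  responses = []
  uri = URI(uri_str)

  limit.times do
    response = getter.call(uri)
    responses << response

    location = response['location']
    # Stop at the first response that is not a redirect with a Location header.
    return responses unless response.is_a?(Net::HTTPRedirection) && location

    # merge resolves a relative Location against the current URI.
    uri = uri.merge(location)
  end

  # You should choose a better exception.
  raise ArgumentError, 'too many HTTP redirects'
end

# Usage (makes real requests, so it is left commented out):
#   chain = fetch_chain('http://jigsaw.w3.org/HTTP/300/302.html')
#   puts chain.map(&:code).join(' -> ')  # every status code, in order
#   puts chain.last.code                 # status code of the final GET
```

Each URL is requested exactly once, so the last element of the returned array is the response to the final GET.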
the Tin Man

If it's enough just to make an HTTP HEAD request without 'consuming' your URL (this would be the usual expectation for a HEAD request), you can do it like this:

2.0.0-p195 :143 > result = Net::HTTP.start('www.google.com') { |http| http.head '/' }
 => #<Net::HTTPFound 302 Found readbody=true> 

So in your example you'd do this:

 ...
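 # HEAD the redirect target (the Location header), not the original URL
 location = URI(response['location'])
 result = Net::HTTP.start(location.host) { |http| http.head location.request_uri }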

If you want to preserve a history of response codes, you could try this. This retains the last 5 response codes from calls to get_response and exposes them through a Net::HTTP.history method.

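require 'net/http'

module Net
  class << HTTP
    alias_method :_get_response, :get_response

    # Wrap get_response so that each call records its status code.
    def get_response(*args, &block)
      resp = _get_response(*args, &block)
      # Keep only the five most recent status codes.
      @history = (@history || []).push(resp.code).last(5)
      resp
    end

    def history
      @history || []
    end
  end
end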

(I don't entirely get the usage scenario, so adapt to your needs)
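
One caveat with the `alias_method` patch above: if the file is loaded twice, `_get_response` ends up aliased to the patched method and the call recurses forever. A possible alternative, sketched here with `Module#prepend` (the module name `ResponseHistory` and the constant `HISTORY_LIMIT` are made up for this example), keeps the same `Net::HTTP.history` interface:

```ruby
require 'net/http'

# Hypothetical module name; records the status code of each get_response call.
module ResponseHistory
  HISTORY_LIMIT = 5

  def get_response(*args, &block)
    response = super # calls the original Net::HTTP.get_response
    # Keep only the most recent HISTORY_LIMIT status codes.
    @history = (history + [response.code]).last(HISTORY_LIMIT)
    response
  end

  def history
    @history || []
  end
end

# get_response is a class method, so prepend onto the singleton class.
Net::HTTP.singleton_class.prepend(ResponseHistory)
```

As before, `Net::HTTP.history.last` would be the most recent status code.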

struthersneil
  • Is this for the initial GET, after which I'd follow response['location'], or is it for the second GET that follows the redirect? – Mike Kovner Nov 01 '13 at 20:36
  • The HEAD approach would be for the second request. The approach that records the history of response codes would record both--`Net::HTTP.history.last` would be your last response code. – struthersneil Nov 01 '13 at 20:39
  • I probably should have clarified the use-case a bit better. Commenting on my original post with more details. – Mike Kovner Nov 01 '13 at 21:21
  • Feel free to update your question if this isn't what you need. – struthersneil Nov 01 '13 at 21:26