18

If http://foo.com redirects to 1.2.3.4 which then redirects to http://finalurl.com, how can I use Ruby to find out the landing URL "http://finalurl.com"?

stivlo
Mark
  • Please show some sample code so we can tell what HTTP client you are using. – the Tin Man Feb 01 '11 at 20:48
  • I used [final_redirect_url](https://rubygems.org/gems/final_redirect_url) gem to get the final redirected url. It simply returns the final URL as string. – Indyarocks May 03 '17 at 05:20

5 Answers

25

Here are two ways, using HTTPClient and Open-URI:

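require 'httpclient'
require 'open-uri'

URL = 'http://www.example.org'

# HTTPClient#get does not follow redirects by default, so the Location
# header holds the next hop (not necessarily the final URL).
httpc = HTTPClient.new
resp = httpc.get(URL)
puts resp.header['Location']
# >> http://www.iana.org/domains/example/

# Open-URI follows redirects itself; base_uri is the URL it ended up at.
# (Ruby 3.0+ requires URI.open; plain Kernel#open no longer opens URLs.)
URI.open(URL) do |resp|
  puts resp.base_uri.to_s
end
# >> http://www.iana.org/domains/example/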
the Tin Man
  • It's better to use httpc.head(URL) instead of httpc.get(URL). This prevents the whole site from loading. – coderberry Jan 23 '13 at 19:06
  • Agreed, *IF* the host would/could do a redirect on a HEAD. I've seen HEAD responses that show an error with no redirect. I think that's because a HEAD is more exploratory. And that behavior might have been isolated to certain HTTPd and the standards, or versions, changed so it's no longer an issue. – the Tin Man Jan 23 '13 at 23:01
  • I compared the http, curl and open methods and the results are fairly inconsistent. Some give results for urls for which others do not. I am starting to wonder how many different cases a web browser covers to get this consistent. I wish I had the same for ruby. – Jackson Henley Jun 23 '13 at 02:11
  • @JacksonHenley Could you give some example url that will give inconsistent results? – lulalala Aug 08 '13 at 02:49
  • Browsers do a lot of fixups, from what to do with bad URLs, to bad HTML. Their goal is to return *something* to the user, even if it's mangled, because the user's brain can probably glean something usable from it. That's not the same with tools like cURL, Open::URI, etc. They have to have accurate and correct URLs. Parsers like Nokogiri want correct HTML and XML. Nokogiri does do some fix-ups to try to return something that parses correctly, but sometimes it gets the fix-up wrong and we have to intervene before Nokogiri gets the data, fix it, and then pass it on. – the Tin Man Aug 08 '13 at 14:36
  • @JacksonHenley "I wish I had the same for ruby." The only way to get the same behavior as a browser, is to use a browser. See the [Watir](http://watir.com/) project for that. But, be careful what you ask for. The HTML returned isn't necessarily what your original page's request was, because dynamic HTML could be loading/removing page parts or moving them around in the DOM after browser-sniffing. – the Tin Man Aug 08 '13 at 14:39
  • People coming here who are dealing with redirects for the first time should note that just because it returns a value doesn't mean that is the URL you want. It might just return a path that leads to another page with another URL in its response, which in turn leads to yet another URL, which is the final one. – MCB Feb 20 '14 at 22:52
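To illustrate the point in the last comment: a Location header may hold only a path, so it has to be resolved against the URL that was just requested before it can be followed. A minimal sketch using only the standard library; `absolute_location` is a hypothetical helper name, and the URLs are the placeholders from the question:

```ruby
require 'uri'

# Hypothetical helper: turn a Location header value into an absolute URL,
# using the URL that was just requested as the base.
def absolute_location(current_url, location)
  URI.join(current_url, location).to_s
end

# A path-only Location is resolved against the current host.
puts absolute_location('http://foo.com/start', '/landing')
# >> http://foo.com/landing

# An absolute Location is returned unchanged.
puts absolute_location('http://foo.com/start', 'http://finalurl.com/')
# >> http://finalurl.com/
```

This handles a single hop only; a chain of redirects needs the same step repeated for each response.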
3

Another way, using Curb:

require 'curb'

# Lets libcurl follow every redirect, then reports the URL it ended up at.
def get_redirected_url(your_url)
  result = Curl::Easy.perform(your_url) do |curl|
    curl.follow_location = true
  end
  result.last_effective_url
end
fsabattier
2

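This worked for me on JRuby:

require 'net/http'

# Follows Location headers until a response has none, then returns the last URL.
# Note: there is no hop limit, so a redirect loop would never terminate.
def get_final_url(url)
  final_url = ''
  until url.nil?
    final_url = url
    location = Net::HTTP.get_response(URI.parse(url))['location']
    # Resolve relative Location values against the current URL.
    url = location && URI.join(url, location).to_s
  end
  final_url
end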
Maged Makled
1

I implemented a RequestResolver for my own needs:

https://gist.github.com/lulalala/6be104641bcb60f9d0e8

It uses Net::HTTP, follows multiple redirects, and handles relative redirects. It was written for a simple use case, so it may have bugs. If you discover one, please tell me.
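For readers who want the general shape of such a resolver without opening the gist, here is a rough sketch of the same idea. It is not the gist's code or API: `resolve_final_url`, `DEFAULT_FETCHER`, the `limit` of 10, and the injectable `fetcher` are all assumptions made for illustration.

```ruby
require 'net/http'
require 'uri'

# Default fetcher: one plain GET with Net::HTTP, which does not follow
# redirects on its own.
DEFAULT_FETCHER = ->(uri) { Net::HTTP.get_response(uri) }

# Hypothetical helper: follows up to `limit` redirects and returns the
# final URL as a String. Raises if the chain is longer than `limit`.
def resolve_final_url(url, limit: 10, fetcher: DEFAULT_FETCHER)
  uri = URI.parse(url)
  limit.times do
    response = fetcher.call(uri)
    location = response['location']
    # Stop at the first response that is not a redirect with a Location.
    return uri.to_s unless response.is_a?(Net::HTTPRedirection) && location
    # Resolve relative Location values against the URL just requested.
    uri = URI.join(uri.to_s, location)
  end
  raise "more than #{limit} redirects starting from #{url}"
end
```

Calling `resolve_final_url('http://foo.com')` would issue real requests; the `fetcher` argument exists so the loop can be exercised with canned responses instead. The hop limit is what keeps a redirect loop from running forever.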

lulalala
0

I'm not much of a Ruby user, but what you basically need is something to interpret HTTP headers. The following library appears to do that:

http://www.ensta.fr/~diam/ruby/online/ruby-doc-stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html

Skip down to "following redirection."

Steve Howard