I made a handy little link expander using curl (via curb) within my Ruby (Sinatra) app.

  require 'curb'

  def curbexpand(link)
    result = Curl::Easy.new(link)
    begin
      result.headers["User-Agent"] = "..."
      result.verbose = true
      result.follow_location = true
      result.max_redirects = 3
      result.connect_timeout = 5
      result.perform
      return result.last_effective_url # Returns the final destination URL after x redirects...
    rescue
      # Log before returning; a statement placed after `return` never runs.
      puts "XXXXXXXXXXXXXXXXXXX Error parsing link XXXXXXXXXXXXXXXXXXXXXXXXXXX"
      return link
    end
  end

The problem is that some geniuses are using URL shorteners to link to .exe and .dmg files. That would be fine, but my curb code above appears to wait for the full response body (which could be a 1 GB file!) before returning the URL. I don't want to use third-party link expander APIs, as I have a significant volume of links to expand.

Does anyone know how I can tweak curb to just find the final URL rather than waiting for the full response?

Colm Troy

1 Answer

I've done what you want using Net::HTTP to send "HEAD" requests and look for redirects that way. The advantage is that a HEAD response contains only headers, no content.

From the docs:

head(path, initheader = nil) 

Gets only the header from path on the connected-to host. header is a Hash like { 'Accept' => '*/*', … }.

This method returns a Net::HTTPResponse object.

This method never raises an exception.

require 'net/http'

response = nil
# HEAD asks the server for the headers only, so no body is downloaded.
Net::HTTP.start('some.www.server', 80) {|http|
  response = http.head('/index.html')
}
p response['content-type']

Combine that with the example in the Net::HTTP docs for following redirection, and you should be able to find your landing URL.
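Here is a rough sketch of that combination, not something lifted from the docs. The names `expand_link` and `HEAD_FETCHER` and the `fetcher:` parameter are made up for illustration, and the timeouts and redirect limit are assumptions copied from the question's code. It sends one HEAD request per hop and follows the `Location` header until it hits a non-redirect response or the limit.

```ruby
require 'net/http'
require 'uri'

# Hypothetical default fetcher: sends one HEAD request, returns the response.
HEAD_FETCHER = lambda do |uri|
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: 5, read_timeout: 5) do |http|
    http.head(uri.request_uri)
  end
end

# Follows up to max_redirects redirects using HEAD requests only and
# returns the last URL reached. Falls back to the original link on any error.
def expand_link(link, max_redirects: 3, fetcher: HEAD_FETCHER)
  current = link
  max_redirects.times do
    response = fetcher.call(URI.parse(current))
    location = response['location']
    # Stop as soon as the response is not a redirect with a Location header.
    break unless response.is_a?(Net::HTTPRedirection) && location
    # Location may be relative, so resolve it against the current URL.
    current = URI.join(current, location).to_s
  end
  current
rescue StandardError
  link
end
```

Because only headers are requested, a shortened link that points at a large .exe or .dmg should resolve without downloading the file. Some servers answer HEAD differently from GET, so treat this as a starting point rather than a drop-in replacement.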

You can probably use curb's Curl::Easy.http_head to accomplish much the same thing.

the Tin Man
  • nice one Tin Man - I've seen Net::HTTP but never took it for a spin - will try it out, thanks – Colm Troy Mar 07 '12 at 22:57
  • It's lower level, so you'll have to do some additional work, but for this purpose it's a nice fit. And, since the docs have what you need, it should make it pretty easy to get working. – the Tin Man Mar 07 '12 at 22:59