
I'm writing my first Ruby program and have hit a problem. The code has to download 32 MP3 files over HTTP. It downloads a few, then times out.

I tried setting a timeout period, but it makes no difference. Running the code under Windows, Cygwin and Mac OS X has the same result.

This is the code:

require 'rubygems'
require 'open-uri'
require 'nokogiri'
require 'set'
require 'net/http'
require 'uri'

 puts "\n Up and running!\n\n"

 links_set = {}

 pages = ['http://www.vimeo.com/siai/videos/sort:oldest',
   'http://www.vimeo.com/siai/videos/page:2/sort:oldest',
   'http://www.vimeo.com/siai/videos/page:3/sort:oldest']

 pages.each do |page|
  doc = Nokogiri::HTML(open(page))
  doc.search('//*[@href]').each do |m|
   video_id = m[:href]
   if video_id.match(/^\/(\d+)$/i)
     links_set[video_id[/\d+/]] = m.children[0].to_s.split(" at ")[0].split(" -- ")[0]
    end
   end
 end

 links = links_set.to_a

 p links

 cookie = ''
 file_name = ''

 open("http://www.tubeminator.com") {|f|
   cookie = f.meta['set-cookie'].split(';')[0]
 }

 links.each do |link|
  open("http://www.tubeminator.com/ajax.php?function=downloadvideo&url=http%3A%2F%2Fwww.vimeo.com%2F" + link[0],
   "Cookie" => cookie) {|f|
      puts f.read
  } 

  open("http://www.tubeminator.com/ajax.php?function=convertvideo&start=0&duration=1120&size=0&format=mp3&vq=high&aq=high",
   "Cookie" => cookie) {|f|
      file_name = f.read
   }
  puts file_name

  Net::HTTP.start("www.tubeminator.com") { |http|
   #http.read_timeout = 3600 # 1 hour
     resp = http.get("/download-video-" + file_name)
     open(link[1] + ".mp3", "wb") { |file|
        file.write(resp.body)
     }
    }  
 end 

 puts "\n Yay!!"

And this is the exception:

/Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:140:in `rescue in rbuf_fill': Timeout::Error (Timeout::Error)
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:134:in `rbuf_fill'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:116:in `readuntil'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/protocol.rb:126:in `readline'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:2138:in `read_status_line'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:2127:in `read_new'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:1120:in `transport_request'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:1106:in `request'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:312:in `block in open_http'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/net/http.rb:564:in `start'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:306:in `open_http'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:767:in `buffer_open'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:201:in `catch'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:669:in `open'
 from /Users/test/.rvm/rubies/ruby-1.9.2-preview1/lib/ruby/1.9.1/open-uri.rb:33:in `open'
 from test.rb:38:in `block in <main>'
 from test.rb:37:in `each'
 from test.rb:37:in `<main>'

I'd also appreciate your comments on the rest of the code.

the Tin Man
Alan Grey
  • Maybe there's something going wrong when you build the url for the video download. Pick the problematic url and try to download it manually. – Lucas Feb 09 '10 at 13:19
  • Hi Lucas :) The URL is OK; I can download the file with a browser. The problem is that it times out while downloading larger files (around 20 MB). – Alan Grey Feb 09 '10 at 13:22

2 Answers


For Ruby 1.8, I used this to solve my timeout issues. It reopens the Net::HTTP class, calls the original initializer with the default parameters, and then sets my own read_timeout, which should keep things sane, I think.

require 'net/http'

# Lengthen timeout in Net::HTTP
module Net
    class HTTP
        alias old_initialize initialize

        def initialize(*args)
            old_initialize(*args)
            @read_timeout = 5*60     # 5 minutes
        end
    end
end
the Tin Man
Ransom
  • Thank you, you're like an angel that came from heaven right at the very moment I needed you. – Darren Feb 16 '12 at 18:50
  • @Darren glad i could be of help :-) – Ransom Feb 22 '12 at 23:02
  • Nick, you can put this code anywhere in your application hierarchy, as long as the file it is saved in is loaded/required at runtime. – Ransom Jul 12 '12 at 10:36
  • @RansomAni-Gizzle, why did you do that instead of saying `http = Net::HTTP.new(uri.host, uri.port); http.open_timeout = 5 * 60; http.read_timeout = 5 * 60` like I did here http://stackoverflow.com/questions/14314375/increase-timeout-for-nethttp – Grienders Jan 14 '13 at 11:57
  • @Grienders Because I needed the solution to be modular and instantiated just once. Your solution works as well, but if you used the variable http in another file without making it global, you would have to repeat "Net::HTTP.new(uri.host, uri.port); http.open_timeout = 5 * 60; http.read_timeout = 5 * 60" over and over in different files when you could have it in one place where it matters. I would find that messy. Just personal preference, I guess. :-) – Ransom Mar 20 '13 at 16:01
  • RansomAni-Gizzle: I think Grienders' point is that your example does not handle the connection timeout (server unreachable or firewalled), just the read timeout (too slow to respond after the connection was established properly). I would update it to also have something like @open_timeout = SOME_VALUE, which handles this case as well. – Cristian Măgherușan-Stanciu Apr 29 '13 at 10:43
  • @CristianMăgherușan-Stanciu Thanks for pointing out what Grienders meant. From the logs posted in the initial question, the error was around read_timeout and my answer sought to address just that. I believe there could be a better answer that goes beyond the scope of the initial question. Just didn't want to kill a fly with a bazooka and confuse anyone further :-) – Ransom May 21 '13 at 17:11
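
For comparison, the per-instance approach discussed in the comments above could look roughly like this. It is only a sketch: `build_http` is a hypothetical helper name, not something from the thread, and the 30-second and 5-minute defaults are illustrative. It sets both `open_timeout` (waiting for the connection) and `read_timeout` (waiting for data once connected), and it does not open a connection by itself.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not from the thread): returns a Net::HTTP object
# with both timeouts configured. Net::HTTP.new does not connect yet.
def build_http(url, open_timeout = 30, read_timeout = 5 * 60)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.open_timeout = open_timeout # seconds to wait for the connection to open
  http.read_timeout = read_timeout # seconds to wait for each read from the socket
  http
end

http = build_http("http://www.tubeminator.com")
# Requests made later through http.start { |conn| ... } would use these limits.
```

Wrapping this in one helper keeps the settings in a single place without reopening Net::HTTP, which is the trade-off the comments above are debating.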

Your timeout isn't in the code you set the timeout for. It's here, where you use open-uri:

open("http://www.tubeminator.com/ajax.php?function=downloadvideo&url=http%3A%2F%2Fwww.vimeo.com%2F" + link[0],

You can set a read timeout for open-uri like so:

#!/usr/bin/ruby1.9

require 'open-uri'

open('http://stackoverflow.com', 'r', :read_timeout=>0.01) do |http|
  http.read
end

# => /usr/lib/ruby/1.9.0/net/protocol.rb:135:in `sysread': \
# => execution expired (Timeout::Error)
# => ...
# =>         from /tmp/foo.rb:5:in `<main>'

:read_timeout is new for Ruby 1.9 (it's not in Ruby 1.8). 0 or nil means "no timeout."
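
Applied to the calls in the question, this might look something like the sketch below. The helper names `open_uri_options` and `fetch` are hypothetical, and the one-hour default is borrowed from the commented-out line in the question's code. In the options hash, open-uri treats string keys as request headers and symbol keys as its own options. Note that `URI.open` is the spelling on Ruby 2.5 and later; on Ruby 1.9 the plain `open` used in the question takes the same arguments.

```ruby
require 'open-uri'

# Hypothetical helper: builds the options hash for open-uri.
# String keys ("Cookie") are sent as HTTP headers; symbol keys
# (:read_timeout) are open-uri options.
def open_uri_options(cookie, read_timeout = 3600)
  { "Cookie" => cookie, :read_timeout => read_timeout }
end

# Hypothetical helper: fetches url with the cookie and read timeout applied.
# Assumes Ruby 2.5+ for URI.open; on Ruby 1.9 use Kernel#open instead.
def fetch(url, cookie, read_timeout = 3600)
  URI.open(url, open_uri_options(cookie, read_timeout)) { |f| f.read }
end
```

Each of the two `open` calls inside the `links.each` loop could then be replaced by `fetch(url, cookie)`, so the longer timeout covers the slow conversion request as well.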

the Tin Man
Wayne Conrad