I have this simple example:

require 'watir-webdriver'

arr = []
sites = [
"www.google.com",
"www.bbc.com",
"www.cnn.com",
"www.gmail.com"
]

sites.each do |site|
    arr << Thread.new {
        b = Watir::Browser.new :chrome
        b.goto site
        puts b.url
        b.close
    }
end
arr.each {|t| t.join}

Every time I run this script, I get:

ruby/2.1.0/net/http.rb:879:in `initialize': Connection refused - connect(2) for "127.0.0.1" port 9517 (Errno::ECONNREFUSED)

Or one of the browsers closes unexpectedly on at least one of the threads.

On the other hand, if I add sleep 2 at the end of every loop cycle, everything runs smoothly! Any idea why that is?

Must be something related to understanding how threads work...

MichaelR

1 Answer

You're basically creating a race condition between your browser instances over the free port that watir-webdriver finds. Your first instance sees that port 9517 is free and picks it. Because you're spinning up these instances in parallel, your second instance also sees port 9517 as free and tries to use it. But that port is already being used by the first browser instance, which is why you get this particular error.

This also explains why the sleep 2 fixes the issue. The first browser instance connects to port 9517 and the sleep causes the second browser instance to see that 9517 is taken. It then connects on port 9518.
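If the fixed sleep feels fragile, one option is to let only one thread at a time create a browser. Here is a minimal sketch of that idea, assuming that Watir::Browser.new does not return until the driver is listening on its port. LAUNCH_LOCK, launch_serialized and fake_launch are names made up for this example; fake_launch stands in for Watir::Browser.new :chrome so the snippet runs without a browser installed.

```ruby
# Hypothetical helper: only one thread at a time may run the launch block,
# so one thread's port lookup cannot interleave with another thread's startup.
LAUNCH_LOCK = Mutex.new

def launch_serialized
  LAUNCH_LOCK.synchronize { yield }
end

# Stand-in for `Watir::Browser.new :chrome` (an assumption for this sketch).
# It records how many launches overlap in time.
$active = 0
$max_active = 0

def fake_launch
  $active += 1
  $max_active = [$max_active, $active].max
  sleep 0.01            # pretend the driver takes a moment to start
  $active -= 1
  :browser
end

threads = 4.times.map do
  Thread.new do
    # Real code would be: launch_serialized { Watir::Browser.new :chrome }
    browser = launch_serialized { fake_launch }
    # b.goto / b.close would run here, outside the lock, still in parallel
    browser
  end
end

results = threads.map(&:value)
```

Only the startup is serialized; the page loads themselves still run concurrently, so this should cost less than sleeping between every thread.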

EDIT

You can see how this is implemented in Selenium::WebDriver::Chrome::Service#initialize, which calls Selenium::WebDriver::PortProber. PortProber is how the webdriver determines which port is free to use.
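To illustrate the check-then-use gap, here is a simplified stand-in for a port prober. This is not the library's actual code, and first_free_port is a name invented for this example. It only shows that two probes made before anything binds the port come back with the same answer.

```ruby
require 'socket'

# Simplified stand-in for a port prober: returns the first port at or above
# `start` that can be bound right now. The probe socket is closed immediately,
# so the port is only known to be free at the moment of the check.
def first_free_port(start)
  port = start
  loop do
    begin
      TCPServer.new('127.0.0.1', port).close
      return port
    rescue Errno::EADDRINUSE, Errno::EACCES
      port += 1
    end
  end
end

a = first_free_port(9517)
b = first_free_port(9517)  # nothing has bound `a` yet, so the same port comes back

server = TCPServer.new('127.0.0.1', a)  # the "first browser" now claims the port
c = first_free_port(9517)               # a later probe skips past it
server.close
```

Here a and b play the role of two threads probing at nearly the same time, while c is what a thread sees once the first driver is actually listening.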

Johnson
  • So I am running into a similar situation while I'm crawling websites, and when I throw too many workers at it, I start getting the same errors. Is there any way to "protect" the initial connection between the browser and the open port, so that it is ensured? Being slower isn't a problem, but sleeping won't work when I have a giant queue and the amount of time it takes to complete each job is somewhat random. Inevitably, I'll fall into this problem. – kindofgreat Sep 09 '15 at 17:56
  • I guess I could keep the browsers open and going to new URLs, instead of opening and closing a browser instance for each URL. So then I could just have a set number of concurrent workers, say 10, with a long enough sleep period between those 10 to get things started, and then each worker keeps reusing its own browser for new URLs. Not quite as elegant, and a bit more complicated to scale... – kindofgreat Sep 09 '15 at 21:07
  • I'm getting the same error trying to `#close` the browser. Any ideas? – akostadinov Apr 27 '16 at 07:45
  • Can you post a question of your own with a code example? I have a hunch but it's not worth exploring without knowing how you call `#close`. – Johnson Apr 27 '16 at 13:29