
The Goal:

I am trying to do some scraping in Python using a headless browser: Selenium with PhantomJS and GhostDriver.

I am using Python 2.7 on a Mac running Mavericks. I work within Emacs (although the script also fails when run from Terminal). I have already overcome some errors, such as "phantomjs - no such file or directory exists", and have downloaded the binaries from here, which are said to be the latest and to include a pending patch from the official PhantomJS team.

My Test Script:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

desired_cap = {
        'phantomjs.page.settings.loadImages' : True,
        'phantomjs.page.settings.resourceTimeout' : 10000,
        'phantomjs.page.settings.userAgent' : "my_user_agent"
        }

driver = webdriver.PhantomJS(executable_path= "/usr/local/bin/phantomjs", desired_capabilities=desired_cap)
driver.set_window_size(1024, 768)
driver.get('https://google.com/')

driver.save_screenshot("testing.png")
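# page_source is a property (a unicode string), not a method, so write it to the file explicitly
with open("source_code.txt", "wb") as f:
    f.write(driver.page_source.encode("utf-8"))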

element = driver.find_element_by_xpath('//*[@id="gbqfq"]')
element.send_keys('testing')
element.send_keys(Keys.ENTER)

Here is a link to the webdriver's simple explanation: http://selenium.googlecode.com/svn/trunk/docs/api/py/webdriver_phantomjs/selenium.webdriver.phantomjs.webdriver.html

The Error Message:

The error message that I was getting was a 'Connection Refused' error.

What I have tried:

I started with a simple example from a tutorial before attempting anything more complicated, but I still get errors. One tutorial... a second

I made one last change to the phantomjs service.py file, which I found here. Namely, I changed:

self.process = subprocess.Popen(self.service_args, stdout=self._log, stderr=self._log)

to:

self.process = subprocess.Popen(['/usr/bin/env', 'phantomjs', '--webdriver=59202'])

The last argument, --webdriver, seems rather arbitrary to my inexperienced eyes. I thought it might correspond to the port used by GhostDriver, which is displayed after each run in ghostdriver.log and is different each time. Because the port changes on every run, I don't think it makes sense to hard-code it, and when I tried it anyway, it didn't work.
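To illustrate the point about the port not being static, here is a minimal sketch of choosing a free port at run time and building the PhantomJS command line from it, rather than hard-coding 59202. The helper names find_free_port and build_phantomjs_command are hypothetical (they are not part of Selenium or PhantomJS), and the executable path is simply the one used in the test script above. The sketch only builds the command; it does not start PhantomJS.

```python
import socket
from contextlib import closing


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on `host` (hypothetical helper)."""
    with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick any free port
        return sock.getsockname()[1]


def build_phantomjs_command(port, executable="/usr/local/bin/phantomjs"):
    """Build the argument list for PhantomJS in WebDriver mode (hypothetical helper)."""
    return [executable, "--webdriver=%d" % port]


port = find_free_port()
command = build_phantomjs_command(port)
print(command)
```

The resulting list could be passed to subprocess.Popen in place of the hard-coded one. Note that another process could, in principle, grab the port between the check and the launch, so this is a sketch rather than a guarantee.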

The Question:

Does anybody have any idea why the connection is being refused?
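As a way to narrow this down, the following sketch checks whether anything is actually listening on a given port, which is the condition a 'Connection Refused' error usually points to. The function is_port_open is a hypothetical helper, not a Selenium API, and the throwaway local listener only stands in for GhostDriver so that the example runs on its own. In practice the port to probe would be the one reported in ghostdriver.log.

```python
import socket
from contextlib import closing


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds (hypothetical helper)."""
    try:
        with closing(socket.create_connection((host, port), timeout)):
            return True
    except (socket.error, socket.timeout):
        # Includes "connection refused": nothing is listening on that port
        return False


# Demonstration: a throwaway listener stands in for the GhostDriver server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

open_while_listening = is_port_open("127.0.0.1", demo_port)
listener.close()
open_after_close = is_port_open("127.0.0.1", demo_port)

print(open_while_listening, open_after_close)
```

If the check fails for the port that the Python side is trying to reach, the PhantomJS process either did not start or is listening on a different port.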

n1k31t4
  • Running this 'working' script exactly as it is from the second tutorial I mentioned produces the same 'Connection Refused' error: Here is where you can get the script from: [Github Download](https://github.com/thayton/taleo_job_scraper/blob/master/scraper.py) – n1k31t4 Sep 23 '15 at 20:54
