
I have just been given access to a shared hosting account (SSH only), with no sudo and no way to use yum. I am trying to scrape data from a dynamically loaded website, so Scrapy or BeautifulSoup alone will not work. When I use Selenium, I get this error:

/home/sanelywr/voyager-bot/app.py:28: DeprecationWarning: executable_path has been deprecated, please pass in a Service object  
  driver = webdriver.Firefox(options=options, executable_path=gecko_path)  
Traceback (most recent call last):  
  File "/home/sanelywr/voyager-bot/app.py", line 28, in <module>  
    driver = webdriver.Firefox(options=options, executable_path=gecko_path)  
  File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/firefox/webdriver.py", line 180, in __init__
    RemoteWebDriver.__init__(  
  File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 275, in __init__
    self.start_session(capabilities, browser_profile)
  File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 365, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 430, in execute
    self.error_handler.check_response(response)
  File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 247, in check_response
    raise exception_class(message, screen, stacktrace)  
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
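
The DeprecationWarning at the top of this traceback refers to Selenium 4's `Service` object. Below is a minimal sketch of that call shape, assuming Selenium 4 is installed; `build_firefox_driver` is a hypothetical helper name, and the arguments are meant to be the geckodriver and Firefox paths from this question. It only changes how the driver path is passed, so it should silence the warning but is not expected to fix the "Process unexpectedly closed" error on its own.

```python
def build_firefox_driver(gecko_path, firefox_binary):
    """Start headless Firefox using the Selenium 4 Service-based call.

    Hypothetical helper: both arguments are filesystem paths supplied
    by the caller.
    """
    # Imported lazily so this module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.add_argument("-headless")         # Firefox's own headless flag
    options.binary_location = firefox_binary  # path to the unpacked firefox executable

    # The Service object replaces the deprecated executable_path keyword.
    service = Service(executable_path=gecko_path)
    return webdriver.Firefox(service=service, options=options)
```

It would be called as `build_firefox_driver('/home/sanelywr/voyager-bot/geckodriver', '/home/sanelywr/voyager-bot/firefox/firefox')`, followed by `driver.quit()` when done.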

geckodriver.log shows:

1653393333249   geckodriver INFO    Listening on 127.0.0.1:33457  
1653393333256   mozrunner::runner   INFO    Running command: "/home/sanelywr/voyager-bot/firefox/firefox" "--marionette" "--headless" "--no-sandbox" "-headless" "--disable-dev-shm-usage" "--remote-debugging-port" "37442" "--remote-allow-hosts" "localhost" "-no-remote" "-profile" "/tmp/rust_mozprofilei8tj67"   

(firefox:28332): Gtk-WARNING **: 07:55:33.285: Locale not supported by C library.  
    Using the fallback 'C' locale.  
**Error: no display specified**

There is no display on this server, which I checked with the following command:

$ echo $DISPLAY

It prints nothing.

I have already enabled headless mode in the browser options inside the Python code.

And, I have also done the following:

$ export MOZ_HEADLESS=1
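
The same environment checks can be done from Python, which sidesteps shell syntax pitfalls (in a POSIX shell the assignment has to be written without spaces around `=`). This is a minimal sketch using only the standard library; `headless_env` and `has_display` are hypothetical helper names, not part of Selenium.

```python
import os


def headless_env(base_env=None):
    """Return a copy of the environment with MOZ_HEADLESS=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_HEADLESS"] = "1"
    return env


def has_display(env=None):
    """Return True when DISPLAY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))
```

Calling `os.environ.update(headless_env())` before creating the driver should let the geckodriver and Firefox child processes inherit the variable.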

I also cannot use a virtual display such as Xvfb: I can pip install the Python wrapper, but I cannot install Xvfb itself without sudo or yum, and I could not find a .tar.bz2 or .tar.gz build of it.

How can I run the Selenium browser without it needing any display at all? I also tried PhantomJS(), and that does not work either.

My Python script:

import os  
from credentials import API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET  
#from pyvirtualdisplay.xephyr import XephyrDisplay  
#from pyvirtualdisplay import Display  
from selenium import webdriver  
from selenium.webdriver.firefox.options import Options  
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary  
import tweepy  

auth = tweepy.OAuthHandler(API_KEY, API_SECRET)  
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)  
api = tweepy.API(auth)  
options = Options()  
options.add_argument("--headless")  
options.add_argument('--no-sandbox')  
options.headless= True  
#options = webdriver.ChromeOptions()  
#options.add_argument('--no-sandbox')  
options.add_argument("--disable-dev-shm-usage")  
#options.binary_location = '/home/sanelywr/voyager-bot'  
#driver = webdriver.Chrome(executable_path='./chromedriver', options=options)  
#driver = webdriver.PhantomJS()  
gecko_path = '/home/sanelywr/voyager-bot/geckodriver'  
options.binary=FirefoxBinary('/home/sanelywr/voyager-bot/firefox/firefox')  
#display = XephyrDisplay()  
#display = Display(visible=0, size=(800, 600))  
#display.start()  
driver = webdriver.Firefox(options=options, executable_path=gecko_path)  
url=""  
driver.get(url)  

#<-----------Some code for extracting some elements from the page----------------->

#driver.quit()  
#display.stop()

My script folder:

drwxr-xr-x 4 sanelywr sanelywr 4.0K Jan 24  2016 phantomjs-2.1.1-linux-x86_64  
-rw-rw-r-- 1 sanelywr sanelywr  23M Jan 24  2016 phantomjs-2.1.1-linux-x86_64.tar.bz2  
drwxr-xr-x 7 sanelywr sanelywr 4.0K Apr  7  2016 firefox  
-rw-rw-r-- 1 sanelywr sanelywr  50M Apr 11  2016 firefox-45.0.2.tar.bz2  
-rwxr-xr-x 1 sanelywr sanelywr 8.3M Apr  6 11:54 geckodriver  
drwxr-xr-x 2 sanelywr sanelywr 4.0K May 23 02:09 public  
drwxr-xr-x 2 sanelywr sanelywr 4.0K May 23 02:09 tmp  
-rw-r--r-- 1 sanelywr sanelywr  145 May 23 02:09 passenger_wsgi.py  
-rw-r--r-- 1 sanelywr sanelywr  326 May 23 02:30 credentials.py  
-rw-r--r-- 1 sanelywr sanelywr    0 May 23 02:31 temp.txt  
-rw-r--r-- 1 sanelywr sanelywr   16 May 23 02:36 requirements.txt  
drwxrwxr-x 2 sanelywr sanelywr 4.0K May 23 06:29 __pycache__  
-rw-r--r-- 1 sanelywr sanelywr 8.3M May 24 05:26 geckodriver-v0.31.0-linux64.tar  
-rw-r--r-- 1 sanelywr sanelywr 1.9K May 24 07:24 app.py  
-rw-rw-r-- 1 sanelywr sanelywr 4.8K May 24 07:55 geckodriver.log  
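
The listing shows a `firefox-45.0.2.tar.bz2` archive next to geckodriver v0.31.0, so it may be worth confirming which version the unpacked Firefox binary actually reports. As far as I know, headless mode on Linux first shipped in Firefox 55, and an older build would ignore `-headless` and look for a display; treat that threshold as an assumption to check against Mozilla's release notes. A minimal sketch, with hypothetical helper names:

```python
import re
import subprocess

# Assumption: first Firefox major version with headless support on Linux.
HEADLESS_MIN_MAJOR = 55


def parse_firefox_version(text):
    """Extract (major, minor, patch) from text such as 'Mozilla Firefox 45.0.2'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def installed_firefox_version(binary):
    """Run '<binary> --version' and parse its output."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return parse_firefox_version(result.stdout)


def supports_headless(version):
    """Compare the major version against the assumed headless threshold."""
    return version[0] >= HEADLESS_MIN_MAJOR
```

For example, `supports_headless(installed_firefox_version('/home/sanelywr/voyager-bot/firefox/firefox'))` would report whether the bundled build meets the assumed threshold. If the binary will not run at all without a display, the `Version=` entry in the `application.ini` file inside the firefox folder may give the same information.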
