I was recently given access to a shared hosting account (SSH access only), with no sudo and no yum. I am trying to scrape data from a dynamically loaded website, so Scrapy or BeautifulSoup alone won't work. When I use Selenium, I get this error:
/home/sanelywr/voyager-bot/app.py:28: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
driver = webdriver.Firefox(options=options, executable_path=gecko_path)
Traceback (most recent call last):
File "/home/sanelywr/voyager-bot/app.py", line 28, in <module>
driver = webdriver.Firefox(options=options, executable_path=gecko_path)
File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/firefox/webdriver.py", line 180, in __init__
RemoteWebDriver.__init__(
File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 275, in __init__
self.start_session(capabilities, browser_profile)
File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 365, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 430, in execute
self.error_handler.check_response(response)
File "/home/sanelywr/virtualenv/voyager-bot/3.9/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
The geckodriver.log shows:
1653393333249 geckodriver INFO Listening on 127.0.0.1:33457
1653393333256 mozrunner::runner INFO Running command: "/home/sanelywr/voyager-bot/firefox/firefox" "--marionette" "--headless" "--no-sandbox" "-headless" "--disable-dev-shm-usage" "--remote-debugging-port" "37442" "--remote-allow-hosts" "localhost" "-no-remote" "-profile" "/tmp/rust_mozprofilei8tj67"
(firefox:28332): Gtk-WARNING **: 07:55:33.285: Locale not supported by C library.
Using the fallback 'C' locale.
**Error: no display specified**
Apparently there is no display on this server, which I checked with the following command:
$ echo $DISPLAY
It prints nothing.
I have already set the browser to headless mode in the Python code.
I have also run the following:
$ export MOZ_HEADLESS=1
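The same check and setting can also be done from inside the script, so that it does not depend on the shell session the script is started from. This is a minimal sketch, assuming the variable only needs to be present in the environment before the browser process starts; `headless_env` and `has_display` are hypothetical helper names, not part of Selenium.

```python
import os


def headless_env(base_env=None):
    """Return a copy of the environment with MOZ_HEADLESS=1 set.

    Hypothetical helper: it does not modify os.environ, it only builds
    the mapping a child process (such as Firefox) would inherit.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_HEADLESS"] = "1"  # in the shell: no spaces around "="
    return env


def has_display(env=None):
    """Return True if DISPLAY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))


if __name__ == "__main__":
    # Updating os.environ makes processes started later by this script inherit it.
    os.environ.update(headless_env())
    print("DISPLAY set:", has_display())
    print("MOZ_HEADLESS:", os.environ["MOZ_HEADLESS"])
```

Running this before creating the driver at least confirms what the browser process will see, independent of what was exported in the login shell.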
I also cannot use a virtual display such as Xvfb: I can pip install the Python wrapper, but I cannot install Xvfb itself without sudo or yum, and I cannot find a .tar.bz2 or .tar.gz build of it.
How can I run a Selenium browser with no display at all? I also tried PhantomJS(), and that does not work either.
My Python script:
import os
from credentials import API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET
#from pyvirtualdisplay.xephyr import XephyrDisplay
#from pyvirtualdisplay import Display
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
import tweepy
auth = tweepy.OAuthHandler(API_KEY, API_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
options = Options()
options.add_argument("--headless")
options.add_argument('--no-sandbox')
options.headless = True
#options = webdriver.ChromeOptions()
#options.add_argument('--no-sandbox')
options.add_argument("--disable-dev-shm-usage")
#options.binary_location = '/home/sanelywr/voyager-bot'
#driver = webdriver.Chrome(executable_path='./chromedriver', options=options)
#driver = webdriver.PhantomJS()
gecko_path = '/home/sanelywr/voyager-bot/geckodriver'
options.binary = FirefoxBinary('/home/sanelywr/voyager-bot/firefox/firefox')
#display = XephyrDisplay()
#display = Display(visible=0, size=(800, 600))
#display.start()
driver = webdriver.Firefox(options=options, executable_path=gecko_path)
url = ""
driver.get(url)
#<-----------Some code for extracting some elements from the page----------------->
#driver.quit()
#display.stop()
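For reference, the DeprecationWarning in the traceback asks for a Service object instead of `executable_path`. The following is a sketch of that Selenium 4 style setup, assuming the same folder layout as in the script above. It only addresses the warning and may not resolve the "no display specified" error by itself. `driver_paths` and `make_driver` are hypothetical helper names, and the URL is a placeholder.

```python
from pathlib import Path


def driver_paths(base_dir):
    """Build the geckodriver and Firefox binary paths under base_dir."""
    base = Path(base_dir)
    return {
        "geckodriver": str(base / "geckodriver"),
        "firefox": str(base / "firefox" / "firefox"),
    }


def make_driver(base_dir):
    """Start headless Firefox using a Service object (Selenium 4 style)."""
    # Imported here so the path helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    paths = driver_paths(base_dir)

    options = Options()
    options.add_argument("-headless")           # Firefox's own headless flag
    options.binary_location = paths["firefox"]  # used instead of FirefoxBinary(...)

    service = Service(executable_path=paths["geckodriver"])
    return webdriver.Firefox(service=service, options=options)


if __name__ == "__main__":
    driver = make_driver("/home/sanelywr/voyager-bot")
    try:
        driver.get("https://example.com")  # placeholder URL, not from the script
        print(driver.title)
    finally:
        driver.quit()  # always release the browser process
```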
My script folder:
drwxr-xr-x 4 sanelywr sanelywr 4.0K Jan 24 2016 phantomjs-2.1.1-linux-x86_64
-rw-rw-r-- 1 sanelywr sanelywr 23M Jan 24 2016 phantomjs-2.1.1-linux-x86_64.tar.bz2
drwxr-xr-x 7 sanelywr sanelywr 4.0K Apr 7 2016 firefox
-rw-rw-r-- 1 sanelywr sanelywr 50M Apr 11 2016 firefox-45.0.2.tar.bz2
-rwxr-xr-x 1 sanelywr sanelywr 8.3M Apr 6 11:54 geckodriver
drwxr-xr-x 2 sanelywr sanelywr 4.0K May 23 02:09 public
drwxr-xr-x 2 sanelywr sanelywr 4.0K May 23 02:09 tmp
-rw-r--r-- 1 sanelywr sanelywr 145 May 23 02:09 passenger_wsgi.py
-rw-r--r-- 1 sanelywr sanelywr 326 May 23 02:30 credentials.py
-rw-r--r-- 1 sanelywr sanelywr 0 May 23 02:31 temp.txt
-rw-r--r-- 1 sanelywr sanelywr 16 May 23 02:36 requirements.txt
drwxrwxr-x 2 sanelywr sanelywr 4.0K May 23 06:29 __pycache__
-rw-r--r-- 1 sanelywr sanelywr 8.3M May 24 05:26 geckodriver-v0.31.0-linux64.tar
-rw-r--r-- 1 sanelywr sanelywr 1.9K May 24 07:24 app.py
-rw-rw-r-- 1 sanelywr sanelywr 4.8K May 24 07:55 geckodriver.log
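One thing that may be worth checking from this listing is whether the bundled Firefox (the archive is named firefox-45.0.2.tar.bz2) is recent enough for headless mode and for geckodriver v0.31.0. The sketch below only parses the version out of an archive name and compares it against a threshold. The default of 55 is an assumption based on headless mode having landed around Firefox 55 on Linux, and the function names are hypothetical.

```python
import re


def firefox_version_from_name(archive_name):
    """Extract (major, minor, patch) from a name like 'firefox-45.0.2.tar.bz2'.

    Returns None if the name does not look like a Firefox archive.
    """
    match = re.search(r"firefox-(\d+)\.(\d+)(?:\.(\d+))?", archive_name)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def supports_headless(version, min_major=55):
    """Return True if the major version is at least min_major.

    min_major=55 is an assumption; adjust it to whatever the Firefox and
    geckodriver release notes state for the platform in use.
    """
    return version is not None and version[0] >= min_major


if __name__ == "__main__":
    version = firefox_version_from_name("firefox-45.0.2.tar.bz2")
    print("parsed version:", version)
    print("new enough for headless:", supports_headless(version))
```

If the check fails, a newer Firefox tarball unpacked into the same folder would be the thing to try, since that needs neither sudo nor yum.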