-1

I am trying to get selenium to work with Python for web scraping purposes in Google Colab. I countered the following errors:

First I install all the required libraries:

!pip3 install -U selenium
!pip3 install webdriver-manager
!apt-get install -y chromium-browser
!apt install chromium-chromedriver

This gave me this output:

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: selenium in /usr/local/lib/python3.8/dist-packages (4.8.2)
Requirement already satisfied: trio-websocket~=0.9 in /usr/local/lib/python3.8/dist-packages (from selenium) (0.9.2)
Requirement already satisfied: certifi>=2021.10.8 in /usr/local/lib/python3.8/dist-packages (from selenium) (2022.12.7)
Requirement already satisfied: trio~=0.17 in /usr/local/lib/python3.8/dist-packages (from selenium) (0.22.0)
Requirement already satisfied: urllib3[socks]~=1.26 in /usr/local/lib/python3.8/dist-packages (from selenium) (1.26.14)
Requirement already satisfied: sortedcontainers in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (2.4.0)
Requirement already satisfied: outcome in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.2.0)
Requirement already satisfied: exceptiongroup>=1.0.0rc9 in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.1.0)
Requirement already satisfied: attrs>=19.2.0 in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (22.2.0)
Requirement already satisfied: sniffio in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.3.0)
Requirement already satisfied: idna in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (2.10)
Requirement already satisfied: async-generator>=1.9 in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.10)
Requirement already satisfied: wsproto>=0.14 in /usr/local/lib/python3.8/dist-packages (from trio-websocket~=0.9->selenium) (1.2.0)
Requirement already satisfied: PySocks!=1.5.7,<2.0,>=1.5.6 in /usr/local/lib/python3.8/dist-packages (from urllib3[socks]~=1.26->selenium) (1.7.1)
Requirement already satisfied: h11<1,>=0.9.0 in /usr/local/lib/python3.8/dist-packages (from wsproto>=0.14->trio-websocket~=0.9->selenium) (0.14.0)
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: webdriver-manager in /usr/local/lib/python3.8/dist-packages (3.8.5)
Requirement already satisfied: packaging in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (23.0)
Requirement already satisfied: python-dotenv in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (0.21.1)
Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (2.25.1)
Requirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (4.64.1)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (4.0.0)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (2022.12.7)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (1.26.14)
Reading package lists... Done
Building dependency tree       
Reading state information... Done
chromium-browser is already the newest version (1:85.0.4183.83-0ubuntu0.20.04.2).
The following package was automatically installed and is no longer required:
  libnvidia-common-510
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 21 not upgraded.
Reading package lists... Done
Building dependency tree       
Reading state information... Done
chromium-chromedriver is already the newest version (1:85.0.4183.83-0ubuntu0.20.04.2).
The following package was automatically installed and is no longer required:
  libnvidia-common-510
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 21 not upgraded.

Then I tried this code:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(executable_path = '/usr/lib/chromium-browser/chromedriver', options=options)
driver.get("https://www.google.com")

It gave me this error:

<ipython-input-19-aff70976e36a>:8: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
  driver = webdriver.Chrome(executable_path = '/usr/lib/chromium-browser/chromedriver', options=options)
---------------------------------------------------------------------------
WebDriverException                        Traceback (most recent call last)
<ipython-input-19-aff70976e36a> in <module>
      6 options.add_argument('--disable-gpu')
      7 options.add_argument('--disable-dev-shm-usage')
----> 8 driver = webdriver.Chrome(executable_path = '/usr/lib/chromium-browser/chromedriver', options=options)
      9 driver.get("https://www.google.com")

3 frames
/usr/local/lib/python3.8/dist-packages/selenium/webdriver/common/service.py in assert_process_still_running(self)
    115         return_code = self.process.poll()
    116         if return_code:
--> 117             raise WebDriverException(f"Service {self.path} unexpectedly exited. Status code was: {return_code}")
    118 
    119     def is_connectable(self) -> bool:

WebDriverException: Message: Service /usr/lib/chromium-browser/chromedriver unexpectedly exited. Status code was: 1

So I tried to use the service object instead using this code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

options = Options()
options.add_argument("--no-sandbox")
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get("https://www.google.com")

This gave me this error:

---------------------------------------------------------------------------
WebDriverException                        Traceback (most recent call last)
<ipython-input-20-5092e72d4ef5> in <module>
      9 options.add_argument('--disable-gpu')
     10 options.add_argument('--disable-dev-shm-usage')
---> 11 driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
     12 driver.get("https://www.google.com")

5 frames
/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
    243                 alert_text = value["alert"].get("text")
    244             raise exception_class(message, screen, stacktrace, alert_text)  # type: ignore[call-arg]  # mypy is not smart enough here
--> 245         raise exception_class(message, screen, stacktrace)

WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/chromium-browser is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
#0 0x559a65f56d93 <unknown>
#1 0x559a65d252d7 <unknown>
#2 0x559a65d4dab0 <unknown>
#3 0x559a65d49a3d <unknown>
#4 0x559a65d8e4f4 <unknown>
#5 0x559a65d85353 <unknown>
#6 0x559a65d54e40 <unknown>
#7 0x559a65d56038 <unknown>
#8 0x559a65faa8be <unknown>
#9 0x559a65fae8f0 <unknown>
#10 0x559a65f8ef90 <unknown>
#11 0x559a65fafb7d <unknown>
#12 0x559a65f80578 <unknown>
#13 0x559a65fd4348 <unknown>
#14 0x559a65fd44d6 <unknown>
#15 0x559a65fee341 <unknown>
#16 0x7f8b42564609 start_thread

Any idea how to solve this?

Reem
  • 107
  • 1
  • 7

1 Answers1

-1

I solve it using this code:

!pip install selenium
%%shell

# Add debian buster
cat > /etc/apt/sources.list.d/debian.list <<'EOF'
deb [arch=amd64 signed-by=/usr/share/keyrings/debian-buster.gpg] http://deb.debian.org/debian buster main
deb [arch=amd64 signed-by=/usr/share/keyrings/debian-buster-updates.gpg] http://deb.debian.org/debian buster-updates main
deb [arch=amd64 signed-by=/usr/share/keyrings/debian-security-buster.gpg] http://deb.debian.org/debian-security buster/updates main
EOF

# Add keys
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys DCC9EFBF77E11517
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 648ACFD622F3D138
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 112695A0E562B32A

apt-key export 77E11517 | gpg --dearmour -o /usr/share/keyrings/debian-buster.gpg
apt-key export 22F3D138 | gpg --dearmour -o /usr/share/keyrings/debian-buster-updates.gpg
apt-key export E562B32A | gpg --dearmour -o /usr/share/keyrings/debian-security-buster.gpg

# Prefer debian repo for chromium* packages only
# Note the double-blank lines between entries
cat > /etc/apt/preferences.d/chromium.pref << 'EOF'
Package: *
Pin: release a=eoan
Pin-Priority: 500


Package: *
Pin: origin "deb.debian.org"
Pin-Priority: 300


Package: chromium*
Pin: origin "deb.debian.org"
Pin-Priority: 700
EOF

apt-get update
apt-get install chromium chromium-driver
from selenium import webdriver
def web_driver():
    options = webdriver.ChromeOptions()
    options.add_argument("--verbose")
    options.add_argument('--no-sandbox')
    options.add_argument('--headless')
    options.add_argument('--disable-gpu')
    options.add_argument("--window-size=1920, 1200")
    options.add_argument('--disable-dev-shm-usage')
    driver = webdriver.Chrome(options=options)
    return driver
driver = web_driver()

driver.get('https://www.google.com')

driver.quit()
Reem
  • 107
  • 1
  • 7