I am trying to get Selenium to work with Python in Google Colab for web scraping, and I encountered the following errors.
First, I installed all the required libraries:
!pip3 install -U selenium
!pip3 install webdriver-manager
!apt-get install -y chromium-browser
!apt install chromium-chromedriver
This produced the following output:
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: selenium in /usr/local/lib/python3.8/dist-packages (4.8.2)
Requirement already satisfied: trio-websocket~=0.9 in /usr/local/lib/python3.8/dist-packages (from selenium) (0.9.2)
Requirement already satisfied: certifi>=2021.10.8 in /usr/local/lib/python3.8/dist-packages (from selenium) (2022.12.7)
Requirement already satisfied: trio~=0.17 in /usr/local/lib/python3.8/dist-packages (from selenium) (0.22.0)
Requirement already satisfied: urllib3[socks]~=1.26 in /usr/local/lib/python3.8/dist-packages (from selenium) (1.26.14)
Requirement already satisfied: sortedcontainers in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (2.4.0)
Requirement already satisfied: outcome in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.2.0)
Requirement already satisfied: exceptiongroup>=1.0.0rc9 in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.1.0)
Requirement already satisfied: attrs>=19.2.0 in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (22.2.0)
Requirement already satisfied: sniffio in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.3.0)
Requirement already satisfied: idna in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (2.10)
Requirement already satisfied: async-generator>=1.9 in /usr/local/lib/python3.8/dist-packages (from trio~=0.17->selenium) (1.10)
Requirement already satisfied: wsproto>=0.14 in /usr/local/lib/python3.8/dist-packages (from trio-websocket~=0.9->selenium) (1.2.0)
Requirement already satisfied: PySocks!=1.5.7,<2.0,>=1.5.6 in /usr/local/lib/python3.8/dist-packages (from urllib3[socks]~=1.26->selenium) (1.7.1)
Requirement already satisfied: h11<1,>=0.9.0 in /usr/local/lib/python3.8/dist-packages (from wsproto>=0.14->trio-websocket~=0.9->selenium) (0.14.0)
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: webdriver-manager in /usr/local/lib/python3.8/dist-packages (3.8.5)
Requirement already satisfied: packaging in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (23.0)
Requirement already satisfied: python-dotenv in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (0.21.1)
Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (2.25.1)
Requirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from webdriver-manager) (4.64.1)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (4.0.0)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (2022.12.7)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->webdriver-manager) (1.26.14)
Reading package lists... Done
Building dependency tree
Reading state information... Done
chromium-browser is already the newest version (1:85.0.4183.83-0ubuntu0.20.04.2).
The following package was automatically installed and is no longer required:
libnvidia-common-510
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 21 not upgraded.
Reading package lists... Done
Building dependency tree
Reading state information... Done
chromium-chromedriver is already the newest version (1:85.0.4183.83-0ubuntu0.20.04.2).
The following package was automatically installed and is no longer required:
libnvidia-common-510
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 21 not upgraded.
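To confirm what is actually installed, a check along these lines could be run. This is only a minimal sketch using the standard library: the helper name `binary_version` is my own, and it assumes both binaries are on PATH under the names `chromium-browser` and `chromedriver`.

```python
import shutil
import subprocess


def binary_version(name):
    """Return the first line printed by `<name> --version`, or None if the
    binary is not on PATH, cannot be run, or prints nothing."""
    path = shutil.which(name)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print their version to stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


# The binary names below are assumptions based on the apt packages above.
for name in ("chromium-browser", "chromedriver"):
    print(name, "->", binary_version(name))
```

If the two reported versions differ, or one of them comes back as `None`, that may be worth mentioning alongside the errors below.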
Then I tried this code:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(executable_path = '/usr/lib/chromium-browser/chromedriver', options=options)
driver.get("https://www.google.com")
It gave me this error:
<ipython-input-19-aff70976e36a>:8: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
driver = webdriver.Chrome(executable_path = '/usr/lib/chromium-browser/chromedriver', options=options)
---------------------------------------------------------------------------
WebDriverException Traceback (most recent call last)
<ipython-input-19-aff70976e36a> in <module>
6 options.add_argument('--disable-gpu')
7 options.add_argument('--disable-dev-shm-usage')
----> 8 driver = webdriver.Chrome(executable_path = '/usr/lib/chromium-browser/chromedriver', options=options)
9 driver.get("https://www.google.com")
3 frames
/usr/local/lib/python3.8/dist-packages/selenium/webdriver/common/service.py in assert_process_still_running(self)
115 return_code = self.process.poll()
116 if return_code:
--> 117 raise WebDriverException(f"Service {self.path} unexpectedly exited. Status code was: {return_code}")
118
119 def is_connectable(self) -> bool:
WebDriverException: Message: Service /usr/lib/chromium-browser/chromedriver unexpectedly exited. Status code was: 1
So I tried using a Service object instead, with this code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
options = Options()
options.add_argument("--no-sandbox")
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get("https://www.google.com")
This gave me a different error:
---------------------------------------------------------------------------
WebDriverException Traceback (most recent call last)
<ipython-input-20-5092e72d4ef5> in <module>
9 options.add_argument('--disable-gpu')
10 options.add_argument('--disable-dev-shm-usage')
---> 11 driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
12 driver.get("https://www.google.com")
5 frames
/usr/local/lib/python3.8/dist-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
243 alert_text = value["alert"].get("text")
244 raise exception_class(message, screen, stacktrace, alert_text) # type: ignore[call-arg] # mypy is not smart enough here
--> 245 raise exception_class(message, screen, stacktrace)
WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/chromium-browser is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace:
#0 0x559a65f56d93 <unknown>
#1 0x559a65d252d7 <unknown>
#2 0x559a65d4dab0 <unknown>
#3 0x559a65d49a3d <unknown>
#4 0x559a65d8e4f4 <unknown>
#5 0x559a65d85353 <unknown>
#6 0x559a65d54e40 <unknown>
#7 0x559a65d56038 <unknown>
#8 0x559a65faa8be <unknown>
#9 0x559a65fae8f0 <unknown>
#10 0x559a65f8ef90 <unknown>
#11 0x559a65fafb7d <unknown>
#12 0x559a65f80578 <unknown>
#13 0x559a65fd4348 <unknown>
#14 0x559a65fd44d6 <unknown>
#15 0x559a65fee341 <unknown>
#16 0x7f8b42564609 start_thread
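Since the message says the browser itself exited, one thing that might narrow this down is launching the browser directly, without Selenium, and looking at what it prints. The following is a rough sketch under assumptions: `try_launch` is a hypothetical helper of mine, the binary name `chromium-browser` is taken from the error message, and the flags mirror the ones in my options plus `--dump-dom`.

```python
import shutil
import subprocess


def try_launch(browser="chromium-browser", timeout=60):
    """Run the browser headless once against about:blank.

    Returns None if the binary is not on PATH, otherwise a tuple of
    (return_code, stderr_text). return_code is None if the launch
    could not be completed (timeout or OS error).
    """
    path = shutil.which(browser)
    if path is None:
        return None
    cmd = [
        path,
        "--headless",
        "--no-sandbox",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        "--dump-dom",
        "about:blank",
    ]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return (None, "timed out")
    except OSError as exc:
        return (None, str(exc))
    return (result.returncode, result.stderr.strip())


# Prints None if the browser is missing, else its exit code and stderr.
print(try_launch())
```

A non-zero return code or a message on stderr here would suggest the problem is with the installed browser rather than with the Selenium code.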
Any idea how to solve this?