
Every time I try to run my code, I get the error below. I am using scrapy-playwright, and it is unable to launch the browser. What causes this, and how can I fix it?

2022-04-08 17:50:53 [scrapy.core.engine] INFO: Spider opened
2022-04-08 17:50:53 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-04-08 17:50:53 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-04-08 17:50:53 [scrapy-playwright] INFO: Launching browser
2022-04-08 17:50:53 [scrapy-playwright] INFO: Launching browser
2022-04-08 17:51:23 [scrapy.utils.signal] ERROR: Error caught on signal handler: <bound method ScrapyPlaywrightDownloadHandler._engine_started of <scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler object at 0x7fc184e49180>>
Traceback (most recent call last):
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/twisted/internet/defer.py", line 1030, in adapt
    extracted = result.result()
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/scrapy_playwright/handler.py", line 130, in _launch_browser
    self.browser = await browser_launcher(**self.launch_options)
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/async_api/_generated.py", line 11633, in launch
    await self._async(
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/_impl/_browser_type.py", line 90, in launch
    Browser, from_channel(await self._channel.send("launch", params))
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 39, in send
    return await self.inner_send(method, params, False)
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 63, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.TimeoutError: Timeout 30000ms exceeded.
=========================== logs ===========================
<launching> /home/raisulrana/.cache/ms-playwright/chromium-956323/chrome-linux/chrome --disable-background-networking --enable-features=NetworkService,NetworkServiceInProcess --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=ImprovedCookieControls,LazyFrameLoading,GlobalMediaControls,DestroyProfileOnBrowserClose,MediaRouter,AcceptCHFrame,AutoExpandDetailsElement --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost 
--disable-renderer-backgrounding --disable-sync --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --no-sandbox --user-data-dir=/tmp/playwright_chromiumdev_profile-MxuOHN --remote-debugging-pipe --no-startup-window
<launched> pid=15139
[pid=15139][err] [0408/175057.243838:ERROR:exception_handler_server.cc(361)] getsockopt: Invalid argument (22)
============================================================
2022-04-08 17:51:23 [scrapy.utils.signal] ERROR: Error caught on signal handler: <bound method ScrapyPlaywrightDownloadHandler._engine_started of <scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler object at 0x7fc1851456f0>>
Traceback (most recent call last):
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/twisted/internet/defer.py", line 1030, in adapt
    extracted = result.result()
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/scrapy_playwright/handler.py", line 130, in _launch_browser
    self.browser = await browser_launcher(**self.launch_options)
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/async_api/_generated.py", line 11633, in launch
    await self._async(
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/_impl/_browser_type.py", line 90, in launch
    Browser, from_channel(await self._channel.send("launch", params))
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 39, in send
    return await self.inner_send(method, params, False)
  File "/home/raisulrana/anaconda3/envs/scrapy/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 63, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.TimeoutError: Timeout 30000ms exceeded.
=========================== logs ===========================
Raisul Islam
  • Can you provide an example of the code you're working with? It looks to me like you're hitting an error while scraping the site because it loads dynamically. You're not giving Playwright enough time to produce its method coroutine. – dollar bill Apr 10 '22 at 17:23
  • @dollarbill The default scrapy-playwright settings work fine, but I need to set headless=false, and that is where the problem starts. For the last couple of days, the error has occurred whenever I set headless=false. The example linked here works fine with the default settings, i.e. headless=true: https://pastebin.com/raw/FRPF1bQz – Raisul Islam Apr 10 '22 at 18:31

1 Answer


Install these packages first:

playwright
scrapy-playwright
Scrapy

Then run this in the terminal:

playwright install

(This downloads the browser binaries that Playwright needs.)
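
For reference, the headed-mode setup described in the question's comments is usually expressed in the project's settings.py. The following is a minimal sketch, not the asker's actual configuration: the setting names come from scrapy-playwright, while the browser type and the raised timeout value are assumptions chosen for illustration. It shows where headless is switched off; nothing in the thread confirms that it resolves the timeout.

```python
# settings.py (sketch): scrapy-playwright configured to open a visible browser.

# Route HTTP and HTTPS requests through the scrapy-playwright download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright needs the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Assumption: Chromium, matching the browser shown in the launch log.
PLAYWRIGHT_BROWSER_TYPE = "chromium"

# These keyword arguments are passed to Playwright's browser launch call.
PLAYWRIGHT_LAUNCH_OPTIONS = {
    "headless": False,  # show the browser window instead of running headless
    # Assumption: 60 s instead of the 30 s seen in the traceback. This only
    # helps if the browser is slow to start, not if it cannot start at all.
    "timeout": 60 * 1000,
}
```

A headed browser needs a display to draw its window, so on a machine without a graphical session this configuration may still fail to launch. Whether that applies here is an assumption; the logs above do not confirm it.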