3

I have been trying to use requests-html in a venv environment (python 3.7.0 - MacOS 10.15.1), however I am dealing with some certificate issue (I'm not behind any proxy/firewall):

The main call is :

from requests_html import HTMLSession
sessao = HTMLSession()
r1 = sessao.get(url=url_inicio)

The exception is raised while running the GET method, like this:

/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/bin/python "/Users/ricardobarroslourenco/Library/Application Support/JetBrains/Toolbox/apps/PyCharm-P/ch-0/192.6817.19/PyCharm.app/Contents/helpers/pydev/pydevd.py" --multiproc --qt-support=auto --client 127.0.0.1 --port 50377 --file /Users/ricardobarroslourenco/PycharmProjects/zarc/zarc_scraper/main.py
pydev debugger: process 9369 is connecting

Connected to pydev debugger (build 192.6817.19)
[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
Traceback (most recent call last):
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
    chunked=chunked,
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
    conn.connect()
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connection.py", line 394, in connect
    ssl_context=context,
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 412, in wrap_socket
    session=session
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 850, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1108, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1045)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/requests_html.py", line 714, in browser
    self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, args=self.__browser_args)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/launcher.py", line 311, in launch
    return await Launcher(options, **kwargs).launch()
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/launcher.py", line 125, in __init__
    download_chromium()
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/chromium_downloader.py", line 136, in download_chromium
    extract_zip(download_zip(get_url()), DOWNLOADS_FOLDER / REVISION)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/chromium_downloader.py", line 78, in download_zip
    data = http.request('GET', url, preload_content=False)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/request.py", line 76, in request
    method, url, fields=fields, headers=headers, **urlopen_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/request.py", line 97, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/poolmanager.py", line 330, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 760, in urlopen
    **response_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 760, in urlopen
    **response_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 760, in urlopen
    **response_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /chromium-browser-snapshots/Mac/575458/chrome-mac.zip (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1045)')))

Any hints on how to solve this issue? The idea is to scrape some websites which cookies are generated with javascript, and requests-html supposedly solves the problem of renderization (that occurs on the regular requests package).

0 Answers0