Questions tagged [pyppeteer]

Unofficial Python port of Puppeteer, the JavaScript library for automating (headless) Chrome/Chromium browsers.


Pyppeteer is mostly used to:

  1. Generate screenshots and PDFs of pages.
  2. Crawl an SPA and generate pre-rendered content (i.e. "SSR").
  3. Scrape content from websites.
  4. Automate form submission, UI testing, keyboard input, etc.
  5. Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  6. Capture a timeline trace of your site to help diagnose performance issues.
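
As an illustration of the first use case, here is a minimal sketch, not a definitive recipe. It assumes pyppeteer is installed and can download or find Chromium; the helper name `capture_page`, the URL, and the output file names are hypothetical. The helper accepts any page-like object, so only `main()` needs pyppeteer itself.

```python
import asyncio


async def capture_page(page, url, png_path, pdf_path):
    """Load `url` in an already-open page, then save a screenshot and a PDF."""
    await page.goto(url)
    # fullPage captures the whole scrollable page, not only the viewport.
    await page.screenshot({"path": png_path, "fullPage": True})
    # PDF generation is only supported when Chromium runs headless.
    await page.pdf({"path": pdf_path})
    return png_path, pdf_path


async def main():
    # Imported lazily so capture_page can be used or tested without pyppeteer.
    from pyppeteer import launch

    browser = await launch()  # headless by default
    try:
        page = await browser.newPage()
        # Hypothetical URL and file names, chosen only for illustration.
        await capture_page(page, "https://example.com", "example.png", "example.pdf")
    finally:
        await browser.close()
```

To try it, call `asyncio.run(main())` from a script; the first launch may take a while because pyppeteer downloads Chromium if it is not already present.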

Resources:

Differences from puppeteer

185 questions
0 votes, 1 answer

Pyppeteer for cascading drop-down boxes?

I use Python and pyppeteer to crawl a web page and got stuck. The page has two drop-down boxes, A and B. B's items depend on A's selection (the items are retrieved dynamically). My code is listed below but does not work: await page.select("select#ListA",…
0 votes, 1 answer

Puppeteer missing responses and behaving differently than Pyppeteer

I wrote a simple program that only logs requests and responses, once with pyppeteer in Python, and (after I ran into the issues I will describe next) once with puppeteer in JavaScript. Here is the JS code: const puppeteer =…
0 votes, 1 answer

Capture page evaluate response in a variable in Pyppeteer

I am trying to use page.evaluate in Pyppeteer and capture the JS script's response, but I am unable to capture it. In the following code, I am trying to store the result returned by the JS script in the dimensions variable, but it comes back as None: import…
Mahesh
0 votes, 1 answer

Why do I get pyppeteer.errors.PageError when using requests_html?

I'm scraping a list of similar web pages and sometimes get an error (shown at the end). The code I use: from requests_html import HTMLSession import pyppdf.patch_pyppeteer link =…
0 votes, 0 answers

Python Pyppeteer unable to scrape RU retailers

Hello, good day, Stack Overflow people. Issue: the script gets stuck and no data is ever scraped from a Russian retailer, in this case www.vseinstrumenti.ru. Code: import asyncio from pyppeteer import launch class PyppeteerRequests: def __init__(self): …
0 votes, 1 answer

pyppeteer: How to go to the next page by clicking a sub-link (href) in a page using Python pyppeteer

The code below launches the browser and opens the site for a URL: from pyppeteer import launch browser = await launch({"autoClose":False,'headless': False}) page = await browser.newPage() await page.goto('some url') After the page loads, I need to…
0 votes, 4 answers

Docker container auto-healing: is Kubernetes suitable for one instance?

I have one Docker container that runs pyppeteer. It has a memory leak, so it stops within 24 hours. I need some auto-healing system, and I think Kubernetes can do that. No load balancing, just one instance, one container. It is…
Joon
0 votes, 2 answers

Keep session when using requests_html's render function

I have a small internal web page that requires a login. When logged in, a simple HTML page is loaded, and JavaScript scripts then load the actual content of the pages. I want to: log into the page, run the JavaScript, extract information…
MrBerta
0 votes, 1 answer

Pyppeteer fails to download headless chrome when running on AWS Lambda

Pyppeteer (the Python port of Puppeteer) tries to download linux-chrome, but the download fails. This is a Python project that I have dockerized and deployed to AWS Lambda with serverless. I'm using serverless to deploy the Python dependencies…
Dynomike
0 votes, 1 answer

Serverless invoke returns "Unable to marshal response: OSError(30, 'Read-only file system')" for my Python Lambda

When running my Python-based AWS Lambda, I get a read-only file system error. I'm not doing any logging myself, but it looks like serverless is. { "errorMessage": "Unable to marshal response: OSError(30, 'Read-only file system') is not JSON…
0 votes, 2 answers

Where is the headless Chrome browser in Google App Engine?

I am looking for the location of an executable in a Google App Engine (standard environment). The reason is that I am trying to use pyppeteer for some work, but pyppeteer always downloads Chromium into a custom folder and then exits. I saw that…
dakes
0 votes, 1 answer

Unable to let my script perform all the clicks on the next page button

I've created a script in Python using pyppeteer to collect the names of different institutions across multiple pages of a website. What I wish to do is let my script move through the pages by clicking the next page button while parsing the…
robots.txt
0 votes, 1 answer

Getting element content in shadow roots with Pyppeteer

I have the JS path of an object that I am interested in. This path contains many shadow roots. I am trying to get the element's content with the Python headless-Chrome API. Because of the shadow roots, I can't use page.querySelector. So I probably have to execute…
0 votes, 1 answer

How to scrape active data generated by JS on a map

I'm a new Python user and I want to scrape data from this website: https://www.telerad.be/Html5Viewer/index.html?viewer=telerad_fr My problem is that the data is dynamically generated. I have read about a few possible fixes, but none is satisfying. With…
0 votes, 1 answer

Is there a way to scroll to the end of a page in pyppeteer?

I have tried looking in the documentation and elsewhere, but I have not been able to find a way to scroll down to the bottom of a page while using the pyppeteer library with Python 3. It would be great if anyone could point me in the right direction or to a solution.
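
One possible approach, sketched here under assumptions rather than taken from an accepted answer: call `page.evaluate` with a small JavaScript function that scrolls to `document.body.scrollHeight`, and repeat until the height stops growing (useful for pages that load more content on scroll). The helper name `scroll_to_bottom` and its `pause` and `max_rounds` parameters are hypothetical.

```python
import asyncio

# JavaScript run in the page: scroll to the bottom and report the page height.
SCROLL_JS = """() => {
    window.scrollTo(0, document.body.scrollHeight);
    return document.body.scrollHeight;
}"""


async def scroll_to_bottom(page, pause=0.5, max_rounds=20):
    """Scroll until the page height stops growing or max_rounds is reached.

    `page` is expected to be a pyppeteer Page (anything with an async
    `evaluate` method works). Returns the last observed page height.
    """
    last_height = -1
    for _ in range(max_rounds):
        height = await page.evaluate(SCROLL_JS)
        if height == last_height:
            break  # nothing new was loaded since the previous scroll
        last_height = height
        # Give lazily loaded content a moment to arrive before checking again.
        await asyncio.sleep(pause)
    return last_height
```

With a real page this would be used as `await scroll_to_bottom(page)` after `await page.goto(...)`. The fixed pause is a simplification; a slow site may need a longer pause or a wait on a specific selector.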