Questions tagged [pyppeteer]

Unofficial Python port of Puppeteer, the JavaScript library for automating (headless) Chrome/Chromium browsers.


Pyppeteer is mostly used to:

  1. Generate screenshots and PDFs of pages.
  2. Crawl an SPA and generate pre-rendered content (i.e. "SSR").
  3. Scrape content from websites.
  4. Automate form submission, UI testing, keyboard input, etc.
  5. Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  6. Capture a timeline trace of your site to help diagnose performance issues.
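As a rough illustration of the first use case in the list above (screenshots and PDFs), here is a minimal sketch. The helper name `capture` and its parameters are hypothetical and chosen only for this example. It assumes pyppeteer is installed and can launch a headless Chromium build, which PDF output requires.

```python
import asyncio


async def capture(url: str, png_path: str, pdf_path: str) -> None:
    """Save a screenshot and a PDF of `url` (hypothetical helper)."""
    # Imported lazily so this module still loads where pyppeteer is missing.
    from pyppeteer import launch

    browser = await launch()  # headless by default
    try:
        page = await browser.newPage()
        await page.goto(url)
        await page.screenshot({"path": png_path})  # PNG of the viewport
        await page.pdf({"path": pdf_path})  # PDF output needs headless mode
    finally:
        # Close the browser even if navigation or rendering fails.
        await browser.close()
```

It could be run with something like `asyncio.run(capture("https://example.com", "example.png", "example.pdf"))`. The URL and file names are placeholders.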

Resources:

Differences from puppeteer

185 questions
4
votes
0 answers

How to continue intercepting requests with pyppeteer?

I've been trying to make a program that actively intercepts requests and returns the response body of those requests as a user browses a site (and performs requests). It seems that the current code only intercepts the details of requests when it…
4
votes
1 answer

Emoji converted to greyed out in PDF output from Chrome Headless

(Note: even though this mentions Pyppeteer, the Python version of Puppeteer, the code is exactly the same and works with either Puppeteer or Pyppeteer.) Hi, I'm converting the page http://getemoji.com/ into PDF using the following code: import…
Cyril N.
  • 38,875
  • 36
  • 142
  • 243
4
votes
1 answer

Can't go on clicking on the next page button while scraping certain fields from a website

I've created a script using python in association with pyppeteer to keep clicking on the next page button until there is no more. The script while clicking on the next page button throws this error pyppeteer.errors.TimeoutError: Navigation Timeout…
robots.txt
  • 96
  • 2
  • 10
  • 36
4
votes
1 answer

RuntimeError: Event loop is closed

I'm trying to marry pyppeteer and quart, but since starting the browser takes a lot of time, I'd rather handle it globally (with an async lock), which seems to mean that I need to handle cleanup manually. Here's my minimal code…
d33tah
  • 10,999
  • 13
  • 68
  • 158
3
votes
0 answers

Pyppeteer connection closed after a minute

Good day everyone. I ran this code and it works perfectly well. The main purpose is to capture websocket traffic, and the problem is that it closes after a minute or thereabouts. Please, how can I fix this? I want it to stay alive forever import…
Bombosonic
  • 61
  • 5
3
votes
0 answers

Keep Pyppeteer browser open indefinitely between Flask requests

By default Pyppeteer opens a new browser for every screenshot; on my setup, this increases the screenshot time by 100% compared to having the browser open. Therefore, my question is: How would I keep Chrome/Pyppeteer browser open (globally) and just…
JamesRicky
  • 201
  • 1
  • 3
  • 17
3
votes
1 answer

Python pyppeteer Intercept/Capture Network Requests

Hi I am trying to intercept all the network calls for a given url using pyppeteer, my code: import asyncio from pyppeteer import launch import pickle async def interceptResponse(response): print("printing response") print(response) …
Pyd
  • 6,017
  • 18
  • 52
  • 109
3
votes
1 answer

Pyppeteer: {'waitUntil': 'networkidle0'} not waiting till page is loaded

So if I use await page.waitFor(9000) or some hard coded wait number, my function will wait till page loads. However, await page.goto(url, {'waitUntil': 'networkidle0'}) results in function running before entire page loads, so script fails. Here is…
MasayoMusic
  • 594
  • 1
  • 6
  • 24
3
votes
1 answer

what does a "RuntimeError: There is no current event loop in thread 'Thread-2'." error mean?

So I've been trying to make a simple bitcoin price checker with pyppeteer. It's working like a charm but whenever I try to implement it to flask I get a runtime error. Essentially, I want to build a web api call that whenever I click a button it…
Gabriel Gavrilov
  • 337
  • 3
  • 11
3
votes
0 answers

pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded. When trying to convert a jupyter notebook to pdf

I'm using nbconvert to convert a notebook to PDF using the following code: (base) C:\Users\25470\Desktop\Data Projects\Quantium Virtual Internship\Assingment 2>jupyter nbconvert task2.ipynb --to webpdf I get the following output: [NbConvertApp]…
Youssef Razak
  • 365
  • 4
  • 11
3
votes
0 answers

Pyppeteer how to login on page with type

I was using Selenium + ChromeDriver for my Python Telegram bot deployed on a Linux server with Docker. Everything works, but it doesn't support async, so my app can't do anything else during scraping. I heard about Pyppeteer, but having some…
3
votes
3 answers

Pyppeteer Browser closed unexpectedly in heroku

I recently deployed an app on Heroku. It uses the Python pyppeteer package. I didn't have any issues while testing on repl.it, but unfortunately on Heroku the browser keeps crashing. I used requirement.txt to install the pyppeteer package. I also tried…
Alen Paul Varghese
  • 1,278
  • 14
  • 27
3
votes
0 answers

Pyppeteer: Detect navigation on Page.click method

I have the following piece of code which loads a page and follows a link within it, using asyncio.gather as recommended in the documentation for the click method: import asyncio import pyppeteer async def main(selector): browser = await…
3
votes
0 answers

Too many open files error when using asyncio/pyppeteer

I'm trying to make requests with headless chrome using pyppeteer. But I keep getting "OSError: [Errno 24] Too many open files" after a certain amount of requests. I checked the open resources of the python process with lsof and found out that with…
Serwj
  • 43
  • 1
  • 8
3
votes
1 answer

Scraping content using pyppeteer in association with asyncio

I've written a script in python in combination with pyppeteer along with asyncio to scrape the links of different posts from its landing page and eventually get the title of each post by tracking the url leading to its inner page. The content I…
robots.txt
  • 96
  • 2
  • 10
  • 36