Questions tagged [pyppeteer]

Unofficial Python port of Puppeteer, the JavaScript library for automating (headless) Chrome/Chromium browsers.


Pyppeteer is mostly used to:

  1. Generate screenshots and PDFs of pages.
  2. Crawl an SPA and generate pre-rendered content (i.e. "SSR").
  3. Scrape content from websites.
  4. Automate form submission, UI testing, keyboard input, etc.
  5. Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  6. Capture a timeline trace of your site to help diagnose performance issues.
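
As a rough illustration of the first item in the list above, the following is a minimal sketch, assuming pyppeteer is installed and can download or locate a Chromium build. The helper name `capture` and its parameters are hypothetical and chosen only for this example; `launch`, `newPage`, `goto`, `screenshot`, `pdf`, and `close` are the pyppeteer calls it relies on.

```python
import asyncio


async def capture(url: str, png_path: str, pdf_path: str) -> None:
    """Save a screenshot and a PDF of `url` (hypothetical helper)."""
    # Imported lazily so this module still loads where pyppeteer is absent.
    from pyppeteer import launch

    browser = await launch()  # headless by default; PDF output needs headless mode
    try:
        page = await browser.newPage()
        await page.goto(url)
        await page.screenshot({'path': png_path})  # write a PNG to png_path
        await page.pdf({'path': pdf_path})         # write a PDF to pdf_path
    finally:
        # Close the browser even if navigation or rendering fails.
        await browser.close()
```

It could be run with something like `asyncio.run(capture('https://example.com', 'example.png', 'example.pdf'))`, where the URL and file names are placeholders.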

Resources:

Differences from puppeteer

185 questions
3
votes
1 answer

pyppeteer and javascript query

I'm trying to migrate a Node project that uses puppeteer to a Python project that uses pyppeteer. I have the JavaScript query below, which works correctly. const values = await page.evaluate( () =>…
Magick
  • 4,603
  • 22
  • 66
  • 103
3
votes
1 answer

Python: Pyppeteer with asyncio

I was doing some tests and I wonder if the script below is running asynchronously? # python test.py It took 1.3439464569091797 seconds. 31 (sites) x 1.34 = 41.54s - so it's a few seconds less but in theory it should take only as long as the…
HTF
  • 6,632
  • 6
  • 30
  • 49
2
votes
0 answers

Replay, with python, a puppeteer script exported from Chrome Recorder

I made a recording with Chrome's recorder, then exported it to a JSON file. Now I'd like to replay this "script" using Python. The de facto package seems to be pyppeteer. I can run scripts written in Python itself, but am trying to run the script I…
thorwhalen
  • 1,920
  • 14
  • 26
2
votes
1 answer

Pyppeteer Navigation Timeout Exceeded

EDIT: I decided to run this as headless=False to see what's happening. Reddit is giving me the "Reddit.com wants to show notifications" and it looks like that's causing the hang-up. Does anyone know how to get around that? I'm working on my capstone…
2
votes
1 answer

Using Pyppeteer to download CSV / Excel file from Vanguard via JavaScript

I'm trying to automate downloading the holdings of Vanguard funds from the web. The links resolve through JavaScript so I'm using Pyppeteer but I'm not getting the file. Note, the link says CSV but it provides an Excel file. From my browser it…
2
votes
1 answer

Pyppeteer RequestSetIntercept function : coroutine was never awaited

I am trying to use the RequestSetIntercept function to speed up the loading of a webpage with Pyppeteer. However, I am getting the warning: RuntimeWarning: coroutine 'block_image' was never awaited. I can't figure out where I am missing an await. I've…
MasayoMusic
  • 594
  • 1
  • 6
  • 24
2
votes
0 answers

Pyppeteer high CPU usage (nearly 50% for one browser)

For some odd reason, running a single Pyppeteer headless Chrome browser uses 50% of my CPU (Ryzen 5 2600X). With a medium-high end CPU like that, I should be able to handle far more than a single browser. Here are my launch arguments: …
2
votes
0 answers

Pyppeteer not extracting javascript coverage correctly

When I use pyppeteer to extract JS coverage, some parts of the JavaScript code are missing. What I'm doing is the following: import asyncio import json import os from pyppeteer import launch def process_coverage(coverage): …
Joaco Terniro
  • 115
  • 1
  • 2
  • 13
2
votes
1 answer

python async def how to return value

I am trying to return a list of XHR urls from Python Async. Below is my code. import asyncio from pyppeteer import launch async def intercept_response(res): resourceType = res.request.resourceType xhr_list = [] if resourceType in…
jackliu
  • 41
  • 2
2
votes
1 answer

pyppeteer browser never closes and TimeoutError raises

I am trying to get XHR using Python Pyppeteer. Here is my code. import asyncio from pyppeteer import launch import json async def intercept_response(res): resourceType = res.request.resourceType if resourceType in ['xhr']: resp =…
jackliu
  • 41
  • 2
2
votes
2 answers

How to click on dynamically generated button using pyppeteer/puppeteer?

I am using Python headless browser library Pyppeteer. It's basically the same as Puppeteer (JS). So solutions which work on Puppeteer should work here too. I need a button to be clicked. Problem is that this button is dynamically generated and its…
mike
  • 101
  • 1
  • 15
2
votes
0 answers

Pyppeteer code freezes after launching chrome

This is a web scraper in Python using pyppeteer. Code part: async def process(query): print('START PROCESS') async with aiofiles.open(os.path.join(BASEDIR, config['PARSER']['Proxies']), mode='r', encoding="utf-8") as f: proxies = await…
Qwentor
  • 31
  • 1
  • 4
2
votes
1 answer

How to set cookies with pyppeteer

I know barely anything about cookies, but I need to set them in order to make my program work. Let's say I have these cookies: "fl-test-cookie-exist=Exist; fl-notice-cookie=true; country_notify=true; _svtri=6b01b3be-4fe8-4b91-8282-1613818f3329;…
Miopadrone
  • 31
  • 5
2
votes
0 answers

How to set multiple cookies to a website using pyppeteer in python

I'm trying to take a screenshot using pyppeteer (Python module), and it works fine. But in some cases, we need to set cookies to access the given URL. Code: import asyncio from pyppeteer import launch from multiprocessing import Process import…
2
votes
1 answer

Calling requests_html or pyppeteer in Python multithreading: Error: signal only works in main thread

I gather that the error is due to the coroutine I/O used in pyppeteer and requests_html, which conflicts with multithreading, but I can't find a way to fix this. I don't speak English very well; I use Google Translate. import…
Gigi Dai
  • 21
  • 2