
I'm trying to move from Selenium to Playwright for some webscraping tasks.

With Selenium I got into the (perhaps bad) habit of keeping the browser open on the side while I test commands and selectors interactively.

Is there any way to achieve something similar using Playwright?

So far I've managed to run Playwright from the console with something like this:

from playwright.sync_api import sync_playwright
with sync_playwright() as pw:
    browser = pw.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto('https://google.com')
    page.pause()

I get a browser window together with the Playwright Inspector, but from that point on none of my commands or variables will execute.

ggorlen
LdL
  • This code should work in interactive mode. Just set timeouts to 0. Or, if you really need a script and you don't need security, `while True: eval(input(">>> "))`? – ggorlen Nov 21 '22 at 15:24
  • I'm using the Python Console in PyCharm; after `page.pause()` no commands take effect. If I add a command to the block, say `page.get_by_role("button", name="Reject all").click()`, it works after restarting the browser. – LdL Nov 21 '22 at 15:54
  • Yep, it kinda works. Unfortunately I cannot assign results to any variables, or if I get exceptions it will just close the event loop. – LdL Nov 21 '22 at 17:02
  • After `page.pause()`, did you try calling `playwright.resume()` from the browser dev tools console? – ggorlen Nov 21 '22 at 18:51

1 Answer


I'd use the technique from "can i run playwright outside of 'with'?" and "How to start playwright outside 'with' without context managers" in the interactive REPL:

PS C:\Users\foo\Desktop> py
Python 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from playwright.sync_api import sync_playwright
>>> p = sync_playwright().start()
>>> browser = p.chromium.launch(headless=False)
>>> page = browser.new_page()
>>> page.goto("https://www.example.com")
<Response url='https://www.example.com/' request=<Request url='https://www.example.com/' method='GET'>>
>>> page.title()
'Example Domain'
>>> page.close()
>>> browser.close()
>>> p.stop()

If you use `page.pause()`, try running `playwright.resume()` in the browser dev tools console to hand control back to the Python REPL.

If you really need to do this from a script rather than the Python REPL, you could use Python's `code` module or roll your own interpreter loop, but I'd try to avoid this if possible.
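For the script route, here is a minimal sketch, assuming the interpreter meant above is the standard library's `code` module. The helper names `make_console` and `run_session` are hypothetical, invented for illustration; the Playwright calls are the same ones used in the REPL transcript. The console keeps its variables in a plain dict, and an exception raised at the prompt prints a traceback and returns to the prompt rather than ending the session.

```python
import code


def make_console(namespace):
    """Return a REPL whose variables live in `namespace` (a plain dict)."""
    return code.InteractiveConsole(locals=namespace)


def run_session(url="https://www.example.com"):
    """Open `url` in a headed browser, then drop into an interactive prompt."""
    # Imported lazily so make_console() is usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    p = sync_playwright().start()
    try:
        browser = p.chromium.launch(headless=False)
        try:
            page = browser.new_page()
            page.goto(url)
            # Names exposed at the prompt; assignments made there persist
            # in this dict for the rest of the session.
            namespace = {"p": p, "browser": browser, "page": page}
            make_console(namespace).interact(
                banner="`page` and `browser` are ready. "
                       "Exit with Ctrl-D (Ctrl-Z then Enter on Windows).",
                exitmsg="Closing browser.",
            )
        finally:
            browser.close()
    finally:
        p.stop()
```

Calling `run_session()` at the end of a script should give a prompt where something like `title = page.title()` can be typed and `title` reused on later lines. Like the `eval(input(...))` loop mentioned in the comments, this executes whatever is typed, so it is only suitable for local debugging.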

ggorlen