
I am trying to get some data from a QWebEngineView using the runJavaScript function, but it fails with the error message below.

Is there a way to solve this? Older topics suggest this is a limitation in PySide2, so I am not sure whether it has been addressed by now.

from PySide2 import QtCore, QtWidgets, QtGui, QtWebEngineWidgets

def callbackfunction(html):
    print(html)

file = "myhtmlfile.html"
view = QtWebEngineWidgets.QWebEngineView()
view.load(QtCore.QUrl.fromLocalFile(file))
view.page().runJavaScript("document.getElementsByTagName('html')[0].innerHTML", callbackfunction)

TypeError: 'PySide2.QtWebEngineWidgets.QWebEnginePage.runJavaScript' called with wrong argument types:
 PySide2.QtWebEngineWidgets.QWebEnginePage.runJavaScript(str, function)
Supported signatures:
 PySide2.QtWebEngineWidgets.QWebEnginePage.runJavaScript(str)
 PySide2.QtWebEngineWidgets.QWebEnginePage.runJavaScript(str, int)
eyllanesc
Joan Venge

1 Answer


PySide2 does not expose all of the overloads of runJavaScript, so it does not support passing a callback to it. A possible workaround is to use QtWebChannel, which implements communication between JavaScript and Python; inside QtWebEngine the page reaches the channel through the qt.webChannelTransport object:

import sys
import os

from PySide2 import QtCore, QtWidgets, QtWebEngineWidgets, QtWebChannel

CURRENT_DIR = os.path.dirname(os.path.realpath(__file__))


class Backend(QtCore.QObject):
    htmlChanged = QtCore.Signal()

    def __init__(self, parent=None):
        super(Backend, self).__init__(parent)
        self._html = ""

    @QtCore.Slot(str)
    def toHtml(self, html):
        self._html = html
        self.htmlChanged.emit()

    @property
    def html(self):
        return self._html


class WebEnginePage(QtWebEngineWidgets.QWebEnginePage):
    def __init__(self, parent=None):
        super(WebEnginePage, self).__init__(parent)
        self.loadFinished.connect(self.onLoadFinished)
        self._backend = Backend()
        self.backend.htmlChanged.connect(self.handle_htmlChanged)

    @property
    def backend(self):
        return self._backend

    @QtCore.Slot(bool)
    def onLoadFinished(self, ok):
        if ok:
            self.load_qwebchannel()
            self.load_object()

    def load_qwebchannel(self):
        file = QtCore.QFile(":/qtwebchannel/qwebchannel.js")
        if file.open(QtCore.QIODevice.ReadOnly):
            content = file.readAll()
            file.close()
            self.runJavaScript(content.data().decode())
        if self.webChannel() is None:
            channel = QtWebChannel.QWebChannel(self)
            self.setWebChannel(channel)

    def load_object(self):
        if self.webChannel() is not None:
            self.webChannel().registerObject("backend", self.backend)
            script = r"""
            new QWebChannel(qt.webChannelTransport, function (channel) {
                var backend = channel.objects.backend;
                var html = document.getElementsByTagName('html')[0].innerHTML;
                backend.toHtml(html);
            });"""
            self.runJavaScript(script)

    def handle_htmlChanged(self):
        print(self.backend.html)


if __name__ == "__main__":
    app = QtWidgets.QApplication(sys.argv)
    filename = os.path.join(CURRENT_DIR, "index.html")
    url = QtCore.QUrl.fromLocalFile(filename)
    page = WebEnginePage()
    view = QtWebEngineWidgets.QWebEngineView()
    page.load(url)
    view.setPage(page)
    view.resize(640, 480)
    view.show()
    sys.exit(app.exec_())

The logic above only obtains the HTML; in this part of the answer I generalize it so that a callback can be associated with any script. The idea is to send each response to the bridge object together with a uuid that identifies the callback. The message must be sent in JSON format so that different types of data can be handled:

import json
import os
import sys

from PySide2 import QtCore, QtWidgets, QtWebEngineWidgets, QtWebChannel
from jinja2 import Template

CURRENT_DIR = os.path.dirname(os.path.realpath(__file__))


class Bridge(QtCore.QObject):
    initialized = QtCore.Signal()

    def __init__(self, parent=None):
        super().__init__(parent)
        self._callbacks = dict()

    @property
    def callbacks(self):
        return self._callbacks

    @QtCore.Slot()
    def init(self):
        self.initialized.emit()

    @QtCore.Slot(str, str)
    def send(self, uuid, data):
        res = json.loads(data)
        callback = self.callbacks.pop(uuid, None)
        if callable(callback):
            callback(res)


class WebEnginePage(QtWebEngineWidgets.QWebEnginePage):
    def __init__(self, parent=None):
        super(WebEnginePage, self).__init__(parent)
        self.loadFinished.connect(self.onLoadFinished)
        self._bridge = Bridge()

    @property
    def bridge(self):
        return self._bridge

    @QtCore.Slot(bool)
    def onLoadFinished(self, ok):
        if ok:
            self.load_qwebchannel()
            self.load_object()

    def load_qwebchannel(self):
        file = QtCore.QFile(":/qtwebchannel/qwebchannel.js")
        if file.open(QtCore.QIODevice.ReadOnly):
            content = file.readAll()
            file.close()
            self.runJavaScript(content.data().decode())
        if self.webChannel() is None:
            channel = QtWebChannel.QWebChannel(self)
            self.setWebChannel(channel)

    def load_object(self):
        if self.webChannel() is not None:
            self.webChannel().registerObject("bridge", self.bridge)
            script = r"""
            var bridge = null;
            new QWebChannel(qt.webChannelTransport, function (channel) {
                bridge = channel.objects.bridge;
                bridge.init();
            });"""
            self.runJavaScript(script)

    def execute(self, code, callback, uuid=""):
        uuid = uuid or QtCore.QUuid.createUuid().toString()
        self.bridge.callbacks[uuid] = callback
        script = Template(code).render(uuid=uuid)
        self.runJavaScript(script)


class MainWindow(QtWidgets.QMainWindow):
    def __init__(self, parent=None):
        super().__init__(parent)

        self.page = WebEnginePage()
        self.view = QtWebEngineWidgets.QWebEngineView()
        self.view.setPage(self.page)

        self.page.bridge.initialized.connect(self.handle_initialized)

        self.setCentralWidget(self.view)

        filename = os.path.join(CURRENT_DIR, "index.html")
        url = QtCore.QUrl.fromLocalFile(filename)
        self.view.load(url)

    def handle_initialized(self):
        self.page.execute(
            """
            var value = document.getElementsByTagName('html')[0].innerHTML
            bridge.send('{{uuid}}', JSON.stringify(value));
        """,
            callbackfunction,
        )


def callbackfunction(html):
    print(html)


if __name__ == "__main__":
    app = QtWidgets.QApplication(sys.argv)
    w = MainWindow()
    w.show()
    sys.exit(app.exec_())
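
The uuid-to-callback dispatch performed by `Bridge.send` can also be tried in isolation, without Qt. The following is only a minimal sketch under stated assumptions: `CallbackRegistry`, `register`, and `dispatch` are hypothetical names that are not part of PySide2, and the standard `uuid` module stands in for `QtCore.QUuid`.

```python
import json
import uuid


class CallbackRegistry:
    """Qt-free stand-in for the Bridge's uuid -> callback bookkeeping (hypothetical helper)."""

    def __init__(self):
        self._callbacks = {}

    def register(self, callback):
        # Create a unique key for this request and remember its callback.
        key = str(uuid.uuid4())
        self._callbacks[key] = callback
        return key

    def dispatch(self, key, data):
        # Mirrors Bridge.send: decode the JSON payload, then call the
        # matching callback once and forget it.
        result = json.loads(data)
        callback = self._callbacks.pop(key, None)
        if callable(callback):
            callback(result)
            return True
        return False


received = []
registry = CallbackRegistry()
key = registry.register(received.append)

# Simulates what the page would send via bridge.send(uuid, JSON.stringify(value)).
handled = registry.dispatch(key, json.dumps("<head></head><body>hi</body>"))
# A second message with the same key finds no callback, because pop() removed it.
handled_again = registry.dispatch(key, json.dumps("ignored"))
```

Because the callback is removed with `pop`, each uuid is answered at most once, which matches the behaviour of `Bridge.send` in the example above.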
eyllanesc
  • Thanks this is very useful. But I don't understand the load_object function. Is this only hard coded to get element by tag name? So I can change it to run any javascript? Even if I don't need to pass a callback function? – Joan Venge Oct 04 '20 at 18:20
  • @JoanVenge 1) I recommend you read about QtWebChannel, 2) In simple words, QtWebChannel acts as a bridge between Python and JavaScript. QtWebChannel maps the properties of a QObject and sends them to JS, which creates an object with those properties. So a QObject is registered as "backend", and a counterpart object with the mapped properties is obtained through "channel.objects.backend". Calling any method of the mapped object, such as toHtml, sends the information to the QObject method over the channel's transport. – eyllanesc Oct 04 '20 at 18:27
  • @JoanVenge 3) My answer addresses your question specifically and does not go beyond it, so do not read more into it. In your specific case you wanted to get the result of `document.getElementsByTagName('html')[0].innerHTML` in Python, and that is all my answer does. I am not trying to implement a generalized function. – eyllanesc Oct 04 '20 at 18:29
  • @JoanVenge 4) I think you should wait for me to answer your other question since there I will show you how to interact with javascript to manipulate properties but I am still working on polishing my solution – eyllanesc Oct 04 '20 at 18:31
  • Thanks, but after your code runs, is it supposed to print the innerHTML? Because I don't get any printout, so I tried page.load_object(), but it's the same. Calling page.handle_htmlChanged I get "object has no attribute _html". – Joan Venge Oct 04 '20 at 18:52
  • @JoanVenge Oops, try again – eyllanesc Oct 04 '20 at 18:56
  • Ok I just tried again, now no error but no print out. Do you get a print out? I only tried the first example. – Joan Venge Oct 04 '20 at 19:14
  • @JoanVenge I have tested it. Recommendations: 1) Run the script from the console, 2) Do not modify my code: `python main.py`, 3) I use the latest versions of the libraries, so if it does not work it is perhaps a bug (or limitation) in the versions of the libraries you use. – eyllanesc Oct 04 '20 at 19:16
  • Oh ok, let me try. Also in your second example jinja2 is a non-standard library right? – Joan Venge Oct 04 '20 at 19:17
  • @JoanVenge Yes, it is not standard, but it is very popular, which is why I do not indicate the repository or how to install it: the documentation is easy to find. – eyllanesc Oct 04 '20 at 19:19
  • I understand. Installation itself is not a problem, but installing 3rd-party libraries is tricky in a studio environment. That's why I was wondering if it was non-standard. – Joan Venge Oct 04 '20 at 19:28
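
On the jinja2 point: the template in the second example only substitutes the `{{uuid}}` placeholder, so a plain string replacement from the standard library may be enough where third-party packages are hard to install. A minimal sketch, assuming no other jinja2 syntax appears in the script; `render_script` is a hypothetical helper, not something from the answer or from PySide2.

```python
import uuid


def render_script(code, request_id):
    # Replace the {{uuid}} placeholder used in the second example.
    # Unlike jinja2, nothing else in the template is interpreted.
    return code.replace("{{uuid}}", request_id)


code = """
var value = document.getElementsByTagName('html')[0].innerHTML
bridge.send('{{uuid}}', JSON.stringify(value));
"""
# The answer uses QtCore.QUuid; the standard uuid module is used here
# only so that the sketch runs without Qt.
request_id = str(uuid.uuid4())
script = render_script(code, request_id)
```

Under that assumption, `Template(code).render(uuid=uuid)` in `execute` could presumably be swapped for `render_script(code, uuid)`, which would also make the `from jinja2 import Template` line unnecessary.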