The setHtml and setContent methods of PyQt5's QWebEngineView have a 2 MB size limit. While searching for ways around this, I found two approaches:

Use SimpleHTTPServer to serve the file. However, this gets blocked by the firewall used in the company.

Use file URLs and point to local files. This is a rather bad solution, though, as the HTML contains confidential data and I can't leave it on the hard drive under any circumstances.

The best solution I currently see is to use file URLs and delete the file on program exit or when loadFinished reports that loading is done, whichever comes first.

This is not a great solution, however, so I wanted to ask: is there a better solution that I'm overlooking?
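
For reference, the write-then-delete idea described above could be sketched roughly as follows. This is a minimal sketch, not the asker's code: the helper names `write_temp_html` and `remove_quietly` are made up for illustration, and `view` in the commented wiring is assumed to be the QWebEngineView. Note that unlinking a file does not scrub its contents from the disk, so the confidentiality concern remains.

```python
import os
import tempfile
from pathlib import Path


def write_temp_html(html):
    """Write html to a temp file and return (path, file_url)."""
    # mkstemp creates the file with owner-only permissions on POSIX systems
    fd, name = tempfile.mkstemp(suffix=".html")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(html)
    path = Path(name)
    return path, path.as_uri()


def remove_quietly(path):
    """Delete the file if it still exists; safe to call more than once."""
    try:
        path.unlink()
    except FileNotFoundError:
        pass


# Hypothetical wiring, assuming 'view' is the QWebEngineView:
# path, url = write_temp_html(html)
# view.loadFinished.connect(lambda ok: remove_quietly(path))
# atexit.register(remove_quietly, path)
# view.load(QUrl(url))
```

Because `remove_quietly` tolerates a missing file, it can be hooked up to both the load signal and program exit without caring which fires first.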

shao.lo
Berserker
  • You just have to open a dynamic port (>=1024), that should not be prohibited by the firewall. – SteakOverflow Feb 22 '18 at 12:08
  • Also the 2MB limit is documented only for setHtml, but not for setContent. Did you actually try setContent? – SteakOverflow Feb 22 '18 at 12:13
  • The 2MB limit is documented for setContent too, as the setHtml docs say it is a shorthand for setContent. And yes, I tried, and I get a blank page with load result = "failed" – Berserker Feb 22 '18 at 12:50
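
The dynamic-port suggestion from the comments could look something like the sketch below, which serves the page from memory on the loopback interface so nothing is written to disk. The function name `serve_in_memory` is hypothetical, and whether a given firewall permits loopback traffic on an OS-assigned port is an assumption that would need checking. Other processes on the same machine could still request the page while the server is running.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def serve_in_memory(html_bytes):
    """Serve html_bytes for every GET on 127.0.0.1, on a port picked by the OS."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(html_bytes)))
            self.end_headers()
            self.wfile.write(html_bytes)

        def log_message(self, format, *args):
            pass  # keep the console quiet

    # Port 0 asks the OS for a free port; binding to 127.0.0.1 keeps it local
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Hypothetical wiring, assuming 'view' is the QWebEngineView:
# server = serve_in_memory(html.encode("utf-8"))
# view.load(QUrl("http://127.0.0.1:{}/".format(server.server_address[1])))
# ... later: server.shutdown(); server.server_close()
```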

2 Answers

Why don't you load/link most of the content through a custom url scheme handler?

webEngineView->page()->profile()->installUrlSchemeHandler("app", new UrlSchemeHandler(e));

class UrlSchemeHandler : public QWebEngineUrlSchemeHandler
{
    Q_OBJECT
public:
    void requestStarted(QWebEngineUrlRequestJob *request) override {
        QUrl url = request->requestUrl();
        QString filePath = url.path().mid(1);
        // get the data for this url
        QByteArray data = ..
        // 
        if (!data.isEmpty()) 
        {
            QMimeDatabase db;
            QString contentType = db.mimeTypeForFileNameAndData(filePath,data).name();
            QBuffer *buffer = new QBuffer();
            buffer->open(QIODevice::WriteOnly);
            buffer->write(data);
            buffer->close();
            connect(request, SIGNAL(destroyed()), buffer, SLOT(deleteLater()));
            request->reply(contentType.toUtf8(), buffer);
        } else {
            request->fail(QWebEngineUrlRequestJob::UrlNotFound);
        }
    }
};

You can then load a page with webEngineView->load(QUrl("app://start.html"));

All relative paths inside the page will also be forwarded to your UrlSchemeHandler.

And remember to add the respective includes:

#include <QWebEngineUrlRequestJob>
#include <QWebEngineUrlSchemeHandler>
#include <QBuffer>
#include <QMimeDatabase>
Cpp Forever
Johannes Munk
  • Mh. I don't know how to do that in PyQt5, but that looks worth investigating. – Berserker Feb 22 '18 at 17:45
  • @Berserker. I tried it in pyqt5, and I can confirm that it works as expected. – ekhumoro Feb 22 '18 at 20:24
  • There is an example in the [qutebrowser](https://github.com/qutebrowser/qutebrowser/blob/master/qutebrowser/browser/webengine/webenginequtescheme.py). It seems to solve the lifetime [issue](https://riverbankcomputing.com/pipermail/pyqt/2016-September/038076.html) of the buffer. – Johannes Munk Feb 22 '18 at 22:43
  • I got this working in my environment; it does get past the 2MB limit and pulls the data directly. Thank you! – Berserker Feb 23 '18 at 12:07
  • Gotta retract a bit, unicode is not working - I'm responding with "job.reply("text/html".encode(), buf)" and buf is a QIODevice, which contains utf8-string.encode(). However, the result has smashed encoding; "ä" becomes "Ã¤" for example. – Berserker Feb 23 '18 at 16:57
  • And got it working. Somewhat confusing though. I had to define , which I didn't need to before the switch to this system, but now it works. – Berserker Feb 23 '18 at 17:06
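
A PyQt5 counterpart of the C++ handler might look roughly like the sketch below. It is an illustration under assumptions rather than the code anyone in this thread used: `PAGES`, `lookup` and `make_handler` are hypothetical names, the sample page declares its charset itself, and the three-slash URL form in the wiring is chosen so that `QUrl.path()` yields `/start.html`. The Qt imports sit inside the factory only so that the lookup logic can be exercised without a Qt installation.

```python
import mimetypes

# Hypothetical in-memory store (URL path -> bytes); nothing touches the disk
PAGES = {
    "/start.html": (
        "<!DOCTYPE html><html><head><meta charset='utf-8'></head>"
        "<body>ä</body></html>"
    ).encode("utf-8"),
}


def lookup(path):
    """Return (mime_type, data) for a known path, or None."""
    data = PAGES.get(path)
    if data is None:
        return None
    mime, _ = mimetypes.guess_type(path)
    return mime or "application/octet-stream", data


def make_handler(parent=None):
    """Build a scheme handler instance; requires PyQt5 with QtWebEngine."""
    from PyQt5.QtCore import QBuffer, QIODevice
    from PyQt5.QtWebEngineCore import (
        QWebEngineUrlRequestJob,
        QWebEngineUrlSchemeHandler,
    )

    class AppSchemeHandler(QWebEngineUrlSchemeHandler):
        def requestStarted(self, job):
            found = lookup(job.requestUrl().path())
            if found is None:
                job.fail(QWebEngineUrlRequestJob.UrlNotFound)
                return
            mime, data = found
            # Parent the buffer to the handler so Python does not collect it
            # early, and free it once the request job is destroyed
            buf = QBuffer(parent=self)
            buf.open(QIODevice.WriteOnly)
            buf.write(data)
            buf.seek(0)
            buf.close()
            job.destroyed.connect(buf.deleteLater)
            job.reply(mime.encode("ascii"), buf)

    return AppSchemeHandler(parent)


# Hypothetical wiring, assuming 'view' is the QWebEngineView:
# handler = make_handler(parent=view)
# view.page().profile().installUrlSchemeHandler(b"app", handler)
# view.load(QUrl("app:///start.html"))
```

The buffer handling mirrors the pattern in the qutebrowser example linked in the comments, which is where the lifetime issue is addressed.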

One way to get around this is to use requests and QWebEnginePage's runJavaScript method:

import json

import requests
from PyQt5.QtWebEngineWidgets import QWebEngineView

web_engine = QWebEngineView()
web_page = web_engine.page()
web_page.setHtml('')

url = 'https://youtube.com'
page_content = requests.get(url).text

# document.write writes a string of text to a document stream
# https://developer.mozilla.org/en-US/docs/Web/API/Document/write
# json.dumps turns the page into a JavaScript string literal, so backticks,
# quotes and backslashes in the content cannot break the script
web_page.runJavaScript('document.write({});'.format(json.dumps(page_content)))
sup