I wrote some PyQt4 code that scrapes a web page and its inner frames:

    import sys, signal
    from PyQt4 import QtGui, QtCore, QtWebKit

    class Sp():
        def save(self, ok, frame=None):
            # Called with no frame for the main frame, and with a frame for each child
            if frame is None:
                print('main-frame')
                frame = self.webView.page().mainFrame()
            else:
                print('child-frame')
            print('URL: %s' % frame.baseUrl().toString())
            print('METADATA: %s' % frame.metaData())
            print('TAG: %s' % frame.documentElement().tagName())
            print('HTML: ' + frame.toHtml())
            print()

        def handleFrameCreated(self, frame):
            # Dump every child frame once it has finished loading
            frame.loadFinished.connect(lambda: self.save(True, frame=frame))

        def main(self):
            self.webView = QtWebKit.QWebView()
            self.webView.page().frameCreated.connect(self.handleFrameCreated)
            self.webView.page().mainFrame().loadFinished.connect(self.save)
            self.webView.load(QtCore.QUrl("http://www.w3schools.com/tags/tryit.asp?filename=tryhtml_iframe_scrolling"))

    # Restore the default SIGINT handler so Ctrl+C ends the Qt event loop
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    print('Press Ctrl+C to quit\n')
    app = QtGui.QApplication(sys.argv)
    s = Sp()
    s.main()
    sys.exit(app.exec_())

This code depends on creating a QApplication instance and running its event loop until exit.
The problem is that QApplication must be created and run in the main thread.
I don't have access to the main thread in the project that I'm developing.
Is it possible to avoid the error “QApplication was not created in main() thread” in some way?
Maybe by rewriting the code so that it works without QApplication, or by somehow making QApplication work outside the main thread?
Edit: I can modify the code that runs in the main thread as long as the change doesn't interrupt its flow of execution. For example, app = QtGui.QApplication([])
wouldn't stop the flow, but a function that blocks until some code in another thread finishes would not be acceptable.
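
To make that constraint concrete, here is a minimal sketch using plain threading and no Qt at all. The names `setup_in_main_thread`, `worker`, and `events` are hypothetical stand-ins I made up for illustration, not part of my project: the first represents a quick call like creating the QApplication, the second represents the scraping work that has to live in another thread.

```python
import threading
import time

events = []  # records what happened, in order

def setup_in_main_thread():
    # Hypothetical stand-in for a quick call such as QtGui.QApplication([]):
    # it returns immediately, so the main thread's flow is not interrupted.
    events.append(("setup", threading.current_thread().name))

def worker():
    # Hypothetical stand-in for the scraping work done in another thread.
    time.sleep(0.05)
    is_main = threading.current_thread() is threading.main_thread()
    events.append(("worker", is_main))

setup_in_main_thread()               # acceptable: returns at once
t = threading.Thread(target=worker)
t.start()
events.append(("main continues", None))  # the main flow carries on right away

# Not acceptable in the real project: a call placed in the main flow that
# blocks until the worker finishes (a join, or an event loop like app.exec_()).
t.join()  # present only so that this sketch exits cleanly
print([name for name, _ in events])
```

In the real project the final `t.join()` would not exist; the main thread would simply keep running its own code while the worker does its job.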