Is there any way to load a URL and manipulate the page DOM without rendering the page? I would like to do it programmatically, without showing the page itself in the browser.
Viewed 1,653 times
1 Answer
4
I believe you should be able to load the web page using QNetworkAccessManager and process its content with QTextDocument; a small example is below. You can also use the QWebPage class without showing the page contents, which gives you access to the DOM through QWebElement. The example below covers that as well:
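// replyFinished() must be declared as a slot in the MainWindow class
#include <QDebug>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QTextBlock>
#include <QTextDocument>
#include <QWebElement>
#include <QWebFrame>
#include <QWebPage>

void MainWindow::on_pushButton_clicked()
{
    // load web page; the manager is parented to the window, so it is cleaned up with it
    QNetworkAccessManager *manager = new QNetworkAccessManager(this);
    connect(manager, SIGNAL(finished(QNetworkReply*)), this, SLOT(replyFinished(QNetworkReply*)));
    manager->get(QNetworkRequest(QUrl("http://www.google.com")));
}

void MainWindow::replyFinished(QNetworkReply* reply)
{
    // the receiver owns the reply; schedule it for deletion
    reply->deleteLater();
    if (reply->error() != QNetworkReply::NoError)
    {
        qDebug() << reply->errorString();
        return;
    }
    // decode the body; UTF-8 is an assumption about the page encoding
    QString content = QString::fromUtf8(reply->readAll());

    // process network reply using QTextDocument
    QTextDocument page;
    page.setHtml(content);
    for (QTextBlock block = page.begin(); block != page.end(); block = block.next())
    {
        // do smth here
        qDebug() << block.text();
    }

    // process network reply using QWebPage (never shown in a view)
    QWebPage webPage;
    webPage.mainFrame()->setHtml(content);
    QWebElement document = webPage.mainFrame()->documentElement();
    // "element_name" is a placeholder for a CSS selector
    QWebElementCollection elements = document.findAll("element_name");
    foreach (QWebElement element, elements)
    {
        // do smth here
        qDebug() << element.toPlainText();
    }
}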
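If QtWebKit is not available where the parsing happens, the idea behind findAll() plus toPlainText() can be roughed out in plain C++. This is only a sketch under heavy assumptions: stripTags and elementTexts are hypothetical helpers, not Qt API, and they expect simple markup with exact-case tag names, no nested elements of the same name, and no comments or scripts. It is not a real HTML parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Drop everything between '<' and '>' so that only the text content remains.
std::string stripTags(const std::string& fragment)
{
    std::string text;
    bool insideTag = false;
    for (char c : fragment) {
        if (c == '<') {
            insideTag = true;
        } else if (c == '>' && insideTag) {
            insideTag = false;
        } else if (!insideTag) {
            text += c;
        }
    }
    return text;
}

// Collect the plain text of every <tag>...</tag> element found in html.
// Hypothetical helper: tag names must match case exactly, and elements of
// the same name nested inside each other are not handled.
std::vector<std::string> elementTexts(const std::string& html, const std::string& tag)
{
    std::vector<std::string> result;
    const std::string open = "<" + tag;
    const std::string close = "</" + tag + ">";
    std::size_t pos = 0;
    while ((pos = html.find(open, pos)) != std::string::npos) {
        const std::size_t afterName = pos + open.size();
        // Skip longer names that merely start with tag, e.g. <pre> when looking for <p>.
        if (afterName >= html.size() || (html[afterName] != '>' && html[afterName] != ' ')) {
            pos = afterName;
            continue;
        }
        const std::size_t contentStart = html.find('>', afterName);
        if (contentStart == std::string::npos)
            break;
        const std::size_t contentEnd = html.find(close, contentStart + 1);
        if (contentEnd == std::string::npos)
            break;
        result.push_back(stripTags(html.substr(contentStart + 1, contentEnd - contentStart - 1)));
        pos = contentEnd + close.size();
    }
    return result;
}
```

For example, elementTexts("<p>Hello <b>world</b></p>", "p") should return a single entry, "Hello world". Because it only touches std::string, it has no ties to the GUI thread, but for real-world pages a proper HTML parser is the safer choice.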
Hope this helps. Regards,

serge_gubenko
Bumping a year-old answer... How would you go about parsing this without QWebPage? To be more precise, what if this code does not run in the main thread, where you can't create a QWebPage? – liliumdev Aug 01 '11 at 23:34