Is it possible to parse HTML pages (HTML/HTML5, not XHTML) using XPath and the QtWebKit classes (and possibly other standard or Qt classes), without external utilities such as tidy?

Thanks a lot!

Zarathustra

1 Answer

No, obviously not. XPath is meant for well-formed XML, which HTML is not (apart from XHTML, which you specifically excluded).

To access the DOM tree of a QtWebKit page, you have to use QtWebKit's QWebElement API.

You can access the document element with

QWebElement root = webView->page()->mainFrame()->documentElement(); // webView is your QWebView*
Chris Browet