
I am traversing a DOM using Qt's WebKit classes. Please have a look at the following pseudo-HTML:

<br>111<a class="node">AAA</a>
<br>222<a class="node">BBB</a>
...

I can easily find the anchors using findAll(). However, I also need the text that precedes each anchor ("111" and "222"). I tried previousSibling(), but of course that gives me the <br> element, since "111" and "222" are text nodes, not elements.

I found a function to access the text within an element, but how can I access the text between the <br> and <a> elements?

Silicomancer
  • Can you modify this HTML? – Rud Limaverde Apr 02 '14 at 17:00
  • No, I can't modify it. – Silicomancer Apr 02 '14 at 17:42
  • So if you access the parent (not sibling), and iterate through all text under it, it gives you... what? `"111\n222"` or what? – hyde Apr 02 '14 at 18:41
  • Using anchorParent.toPlainText() (I suppose this is what you mean) gives me all the text of the parent and of all its descendants (111 AAA 222 BBB). Maybe I could analyse that chunk of text in some way, but that would be kind of absurd when I already have a DOM that should be much easier to analyse. – Silicomancer Apr 02 '14 at 19:32

2 Answers


It seems this is not possible. The only workaround I could find is to get the plain text of the parent node and parse it.
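
For what it's worth, here is a rough sketch of what that parsing could look like. It is plain C++ with std::string so it does not depend on Qt; with Qt you would presumably do the same thing on QString. The function names (textBeforeAnchors, trim) are made up for this example, and it assumes you already have the parent's plain text (e.g. "111 AAA 222 BBB") and the anchor texts in document order (e.g. collected from the findAll() results).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: strip leading and trailing whitespace.
static std::string trim(const std::string &s)
{
    const char *ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: for each anchor text (in document order), return the
// text that sits between the end of the previous anchor and this anchor.
std::vector<std::string> textBeforeAnchors(const std::string &parentText,
                                           const std::vector<std::string> &anchorTexts)
{
    std::vector<std::string> result;
    std::size_t cursor = 0; // position just after the previously matched anchor
    for (const std::string &anchor : anchorTexts) {
        const std::size_t pos = parentText.find(anchor, cursor);
        if (pos == std::string::npos) {
            result.push_back(std::string()); // anchor text not found: leave empty
            continue;
        }
        result.push_back(trim(parentText.substr(cursor, pos - cursor)));
        cursor = pos + anchor.size();
    }
    return result;
}
```

Calling textBeforeAnchors("111 AAA 222 BBB", {"AAA", "BBB"}) yields "111" and "222". Note that this breaks down if the text in front of an anchor happens to contain the anchor's own text, so it is only a workaround, not a real replacement for DOM access.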

Silicomancer

This is the way I solved it:

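QWebElement *element = ...

// QWebElement only exposes element nodes, so re-parse the element's markup
// with QDomDocument (QtXml), which also exposes text nodes.
QDomDocument doc;
if (doc.setContent(element->toOuterXml())) // fails if the markup is not well-formed XML
{
    QDomElement domelem = doc.documentElement();
    for (QDomNode n = domelem.firstChild(); !n.isNull(); n = n.nextSibling())
    {
        QDomText t = n.toText();
        if (!t.isNull())
        {
            // first direct child that is a text node
            qDebug() << t.data();
            break;
        }
    }
}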