12

<td></td><td>foo</td>

I would like to return ['', 'foo'], but libxml's XPath //td/text() returns just ['foo']. How do I get the empty tag back as '' instead of having it not match at all?

joeforker

2 Answers

8

While @Tomalak is perfectly right, in XPath 2.0 one can use:

//td/string(.)

and this produces a sequence of strings -- each one containing the string value of a corresponding td element.

So, in your case the result will be the desired one:

"", "foo"

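Where no XPath 2.0 processor is available, the same string-value semantics can be approximated in host code. The following is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, not libxml itself. The wrapping `<tr>` element and the helper name `string_values` are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET


def string_values(root, tag):
    """Return the string value of every `tag` element under `root`.

    Approximates XPath 2.0's //tag/string(.): the string value of an
    element is the concatenation of all its descendant text, which is
    the empty string when the element has no text at all.
    """
    return ["".join(el.itertext()) for el in root.iter(tag)]


# The <tr> wrapper is assumed so the fragment parses as a single document.
row = ET.fromstring("<tr><td></td><td>foo</td></tr>")
print(string_values(row, "td"))
```

Because the text is joined across descendants, a cell with nested markup such as `<td>a<b>b</b>c</td>` would come back as one string.
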
Dimitre Novatchev
  • 1
    +1. This does exactly what my solution does for XPath 1.0 - it takes the `<td>` nodes and then uses their respective text value. – Tomalak Mar 11 '10 at 19:23
7

As long as you are selecting text nodes specifically, you can't, because there simply is no text node in the first <td>.

When you change your XPath expression to '//td', you get the two <td> nodes. Use their text value in further processing.
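
A minimal sketch of that approach, here using Python's standard-library `xml.etree.ElementTree` rather than libxml's bindings. The `<tr>` wrapper and the helper name `cell_texts` are assumptions for illustration. An element without a text node reports its text as `None`, so it is mapped to `''` explicitly.

```python
import xml.etree.ElementTree as ET


def cell_texts(root):
    """Select the <td> elements themselves, then read each one's text."""
    cells = root.findall(".//td")  # matches elements, not text nodes
    # .text is None when the element has no leading text node.
    return [cell.text or "" for cell in cells]


# The <tr> wrapper is assumed so the fragment parses as a single document.
row = ET.fromstring("<tr><td></td><td>foo</td></tr>")
print(cell_texts(row))
```

Note that `.text` only holds the text before the first child element, so cells containing nested markup would need `"".join(cell.itertext())` instead.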

Tomalak
  • 1
    I wound up finding all the `td` nodes and calling .text on them. Not as cool as doing everything in one big XPath ;-) but it works. – joeforker Mar 11 '10 at 19:45
  • @joeforker: As long as you don't have access to the all-shiny XPath 2.0, that's your only option. :-) – Tomalak Mar 11 '10 at 19:56