Is it possible for xpath to return NULL if there is no text data?

Question

I am currently trying to extract all data from a table. Table data rows are formatted as <td headers="h1" align="left"></td> when there is no data.

Using the etree.tostring() method from the lxml library prints out these elements as <td headers="h1" align="left"/> instead of the source formatting.

Furthermore, when I run tree.xpath('//td[@headers="h1"]/text()'), the resulting list does not include blank values for the cells that have no data.

As I am trying to write these results to a CSV file, how do I include NULL, i.e. "" when there is no data?

score 2 · Accepted Answer · edited May 23 '17 at 12:07

One workaround is to select the elements themselves with the XPath //td[@headers="h1"] and then read the .text attribute of each one:

from lxml import etree

data = """
<table>
    <tr>
        <td headers="h1" align="left"></td>
        <td headers="h1" align="left">Text1</td>
        <td headers="h1" align="left"/>
        <td headers="h1" align="left">Text2</td>
        <td headers="h1" align="left"></td>
    </tr>
</table>
"""

tree = etree.fromstring(data)
print([element.text for element in tree.xpath('//td[@headers="h1"]')])

Prints:

[None, 'Text1', None, 'Text2', None]

If you want empty string instead of None:

print([element.text if element.text is not None else ''
       for element in tree.xpath('//td[@headers="h1"]')])

would print:

['', 'Text1', '', 'Text2', '']
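
Since the goal in the question is a CSV file, here is a minimal sketch of how those values could be passed to the standard csv module. The in-memory buffer and the names cells and csv_text are assumptions made for illustration; a real script would open an actual file instead.

```python
import csv
import io

from lxml import etree

data = """
<table>
    <tr>
        <td headers="h1" align="left"></td>
        <td headers="h1" align="left">Text1</td>
        <td headers="h1" align="left"/>
    </tr>
</table>
"""

tree = etree.fromstring(data)

# Select the elements (not their text nodes) so empty cells are kept,
# then replace a missing text value (None) with an empty string.
cells = [td.text or "" for td in tree.xpath('//td[@headers="h1"]')]

# Hypothetical destination: an in-memory buffer standing in for a real file.
buffer = io.StringIO()
csv.writer(buffer).writerow(cells)
csv_text = buffer.getvalue()
print(repr(csv_text))
```

With a real file, open it with newline='' as the csv module documentation recommends, and pass the file object to csv.writer in place of the buffer.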

Also see: How do I return '' for an empty node's text() in XPath?
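
A variation that keeps more of the work in XPath is to evaluate the XPath 1.0 string() function once per element. This is only a sketch: it still needs a Python loop, and string() returns the concatenated text of the element and all of its descendants, so it differs from .text when a cell contains child elements. The name values is an assumption for illustration.

```python
from lxml import etree

data = """
<table>
    <tr>
        <td headers="h1" align="left"></td>
        <td headers="h1" align="left">Text1</td>
        <td headers="h1" align="left"/>
    </tr>
</table>
"""

tree = etree.fromstring(data)

# string() evaluated on a single element yields '' when it has no text,
# so no explicit None check is needed.
values = [str(td.xpath("string()")) for td in tree.xpath('//td[@headers="h1"]')]
print(values)
```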

answered Jun 01 '14 at 00:28

alecxe


I was hoping that there was a way to do it using xpath, rather than python. – toolshed Jun 01 '14 at 00:46
@Addikt no way, only in xpath 2.0. – alecxe Jun 01 '14 at 00:47
Well, this will have to do then. Thanks. – toolshed Jun 01 '14 at 01:49
