0

I am using python 3.7 and I want to print all divs with id's that start with "def". I don't undestand why my example code is not working:

import xml.etree.ElementTree as ET
html = '<div> <div id="abc 123"></div> <div id="def hhh"></div> <div id="ghi test"></div> </div>'
root = ET.fromstring(html)
print( root.findall("//div[starts-with(@id, 'def')]") )

Result:

SyntaxError: cannot use absolute path on element

I really don't undestand why this is not working. If that would work the next step would be to loop through it and get the id names using .get("id").

What am I doing wrong?

Woods
  • 53
  • 3
  • Does this answer your question? [Python - ElementTree- cannot use absolute path on element](https://stackoverflow.com/questions/5501118/python-elementtree-cannot-use-absolute-path-on-element) – Sorin Feb 16 '20 at 22:16
  • Actually no. // or .// does not work – Woods Feb 16 '20 at 22:44

1 Answers1

2

There are 2 distinct issues here, both handled by other questions.

First:

import xml.etree.ElementTree as ET
html = '<div> <div id="abc 123"></div> <div id="def hhh"></div> <div id="ghi test"></div> </div>'
root = ET.fromstring(html)

try:
    print( root.findall("//div[starts-with(@id, 'def')]") )
except SyntaxError as e:
    print e

See live: https://ideone.com/PfpjYz Error is "cannot use absolute path on element" - this is handled by Python - ElementTree- cannot use absolute path on element

Second issue, when using relative elements, you get a different error:

import xml.etree.ElementTree as ET
html = '<div> <div id="abc 123"></div> <div id="def hhh"></div> <div id="ghi test"></div> </div>'
root = ET.fromstring(html)
try:
    print( root.findall(".//div[starts-with(@id, 'def')]") )
except SyntaxError as e:
    print e

See live: https://ideone.com/BNc7My

Error has changed "invalid predicate" - see xpath-support - starts-with is not there

See: Python XPath SyntaxError: invalid predicate for possible solutions.

Finally, you can use a different xml library, or work around the limitations of this one.

import lxml.etree
root = lxml.etree.fromstring(html)

print root.xpath(".//div[starts-with(@id, 'def')]")[0].attrib

See online: https://ideone.com/Ie1j8F

Note: this is still a duplicate question, and should be closed as such, but this is to long to fit into an comment.

Sorin
  • 5,201
  • 2
  • 18
  • 45