XPath 1.0 does not support regular expressions, so starts-with() can only
match a literal prefix, not a pattern.
lxml does not support XPath 2.0. You have the following three options:
- Switch to a processor that can handle XPath 2.0. You can then use the fn:matches() function.
- Use an XPath 1.0 compliant solution. This is rather ugly, but it works and may in some circumstances be the easiest option. It is not a general solution, though: it replaces every digit in @id with a - and matches against the result, so it also returns true if the original id was something like post--. Use a replacement character that you know cannot occur at this position.
tree.xpath("//div[starts-with(translate(@id, '0123456789', '----------'), 'post--')]")
- lxml supports the EXSLT namespaces and you can use the regex functions from there. In my opinion this is the best solution.
regexpNS = "http://exslt.org/regular-expressions"
r = tree.xpath("//div[re:test(@id, '^post-[0-9]')]", namespaces={'re': regexpNS})
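
To see how the second and third options differ, here is a minimal sketch that runs both expressions against a small made-up document. The sample ids and the variable names are only assumptions for illustration; the two XPath expressions are the ones shown above.

```python
from lxml import etree

# Hypothetical sample document; the ids are invented for this comparison.
doc = etree.fromstring(
    "<root>"
    "<div id='post-1'/>"
    "<div id='post-42'/>"
    "<div id='post--'/>"
    "<div id='post-abc'/>"
    "</root>"
)

# XPath 1.0 workaround: every digit becomes '-' before the prefix test,
# so an id that already contains 'post--' is matched as well.
xpath10 = "//div[starts-with(translate(@id, '0123456789', '----------'), 'post--')]"
ids_translate = [div.get("id") for div in doc.xpath(xpath10)]

# EXSLT regular expressions: only ids with a digit after 'post-' match.
EXSLT_RE = "http://exslt.org/regular-expressions"
ids_regex = [
    div.get("id")
    for div in doc.xpath(
        "//div[re:test(@id, '^post-[0-9]')]",
        namespaces={"re": EXSLT_RE},
    )
]

print(ids_translate)  # ['post-1', 'post-42', 'post--']
print(ids_regex)      # ['post-1', 'post-42']
```

The translate() version picks up the post-- id as a false positive, which is the caveat described in the second option; the EXSLT version does not.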