For any given XML document, I'm trying to generate a list of all the possible XPaths that are built from tag names only (no attributes etc.).
It is enough to get XPaths pointing to leaves.
EXAMPLE:
      A
     / \
    X   Y
   /|\
  1 1 2
       \
        Q
returns
/A/X/1
/A/X/1
/A/X/2/Q
/A/Y
or better (unique)
/A/X/1
/A/X/2/Q
/A/Y
Do you have any idea for a relatively efficient function (probably recursive)?
EDIT:
I came up with this, but I'm afraid its complexity is pretty high:
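import re

def get_leaf_xpaths(root):
    tree = root.getroottree()
    xpaths = set()
    height = get_tree_height(root)  # my own helper, not shown here
    for i in range(height):
        # '/*' selects the root element; each extra '/*' goes one level deeper
        xpath = '/*' + '/*' * i
        elements = root.xpath(xpath)
        for e in elements:
            if len(e) == 0:  # leaf: element has no children
                elem_xpath = tree.getpath(e)
                # strip positional predicates such as [2]
                elem_xpath = re.sub(r'\[[0-9]+\]', '', elem_xpath)
                xpaths.add(elem_xpath)
    return xpaths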
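
For comparison, here is a minimal sketch of a single-pass alternative to the level-by-level approach above. It visits every element once and carries the tag path along, so it needs neither a tree-height helper nor one XPath query per level. The function name `leaf_tag_paths` is made up for illustration, and the sketch uses the standard library's `xml.etree.ElementTree` so that it runs without extra dependencies. I would expect the same function to work on lxml elements, since it only relies on `.tag` and on iterating over an element's children, but that is an assumption; comments, processing instructions and namespaced tags are not handled.

```python
import xml.etree.ElementTree as ET


def leaf_tag_paths(root):
    """Return the set of tag-only paths from root to every leaf element."""
    paths = set()
    # Explicit stack instead of recursion, so very deep documents cannot
    # hit Python's recursion limit. Each entry is (element, path to it).
    stack = [(root, "/" + root.tag)]
    while stack:
        elem, path = stack.pop()
        children = list(elem)
        if not children:
            paths.add(path)  # leaf element: record its path once
        for child in children:
            stack.append((child, path + "/" + child.tag))
    return paths


# Same shape as the example tree, with valid XML names in place of 1 and 2.
doc = ET.fromstring("<A><X><a/><a/><b><Q/></b></X><Y/></A>")
print(sorted(leaf_tag_paths(doc)))
# ['/A/X/a', '/A/X/b/Q', '/A/Y']
```

Using a set makes the result unique automatically; replacing it with a list would give the non-unique variant, although not necessarily in document order because of the stack.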