
For any given XML document, I'm trying to generate a list of all possible XPaths that are built from tag names only (no attributes, predicates, etc.).

It is enough to get the XPaths that point to leaf elements.

EXAMPLE:

      A
     / \
    X   Y
   /|\
  1 1 2
       \
        Q   

returns

/A/X/1
/A/X/1
/A/X/2/Q
/A/Y

or, better, with duplicates removed:

/A/X/1
/A/X/2/Q
/A/Y

Do you have any ideas for a relatively efficient function (probably recursive)?

EDIT:

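I came up with this, but I'm afraid its complexity is pretty high, since it runs one XPath query per tree level:

import re

def get_leaf_xpaths(root):
    tree = root.getroottree()
    xpaths = set()
    # get_tree_height is a helper that is not shown here
    height = get_tree_height(root)
    for i in range(height):
        # Select every element at depth i + 1: '/*', '/*/*', ...
        xpath = '/*' + '/*' * i
        elements = root.xpath(xpath)
        for e in elements:
            # A leaf has no child nodes (getchildren() is deprecated in lxml)
            if len(e) == 0:
                elem_xpath = tree.getpath(e)
                # Strip positional predicates such as [2]
                elem_xpath = re.sub(r'\[[0-9]+\]', '', elem_xpath)
                xpaths.add(elem_xpath)
    return xpaths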
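
One way to avoid the per-level queries might be a single traversal that carries the path string along with each element. The following is only a minimal sketch under some assumptions: it uses the standard library's `xml.etree.ElementTree` instead of lxml, `leaf_tag_paths` and the sample document are illustrative names that are not part of the code above, and the tags `1` and `2` from the example are renamed to `N1` and `N2` because they are not valid XML names.

```python
import xml.etree.ElementTree as ET


def leaf_tag_paths(root):
    """Return the set of tag-only paths to every leaf element under root."""
    paths = set()
    # An explicit stack of (element, path so far) pairs replaces recursion,
    # so deep documents do not hit Python's recursion limit.
    stack = [(root, "/" + root.tag)]
    while stack:
        elem, path = stack.pop()
        children = list(elem)
        if not children:
            # Leaf element: the set removes duplicate paths automatically.
            paths.add(path)
            continue
        for child in children:
            stack.append((child, path + "/" + child.tag))
    return paths


# Hypothetical sample mirroring the example tree (1 -> N1, 2 -> N2).
sample = "<A><X><N1/><N1/><N2><Q/></N2></X><Y/></A>"
print(sorted(leaf_tag_paths(ET.fromstring(sample))))
```

Each element is visited once, so the work is roughly proportional to the number of elements times the cost of building each path string. With lxml elements, comments and processing instructions would probably need to be skipped, since their `tag` is not a string.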
Milano
    Why was this closed as a duplicate? This is clearly a different question, especially with regard to efficiency. – nwellnhof Mar 17 '21 at 09:22
