5

I'm writing a depth-first tree traversal function, and what I want to do is this:

def traverse(node):
    yield node
    for n in node.children:
        yield_all traverse(n) # << if Python had a yield_all statement

The idea is to end up with a (flat) sequence of nodes in the tree.

Approach #1: (propagating yields)

def traverse(node):
    yield node
    for n in node.children:
        for m in traverse(n):
            yield m

Approach #2: (flattening sequences)

import itertools

def traverse(node):
    return itertools.chain([node], *(traverse(n) for n in node.children))

The first approach seems cleaner, but it feels odd to explicitly re-yield every node of the subtree at each level.

The second approach is terse and slightly dirty, but it matches what I would write in Haskell:

traverse node = node : concatMap traverse (children node)

So my question is: which is better? Or am I missing a better third option?

Matt Fenwick
perimosocordiae
  • List comprehensions would make this cleaner. – Rafe Kettler Dec 14 '10 at 21:49
  • Rafe: Write an answer and show me! :-) – perimosocordiae Dec 14 '10 at 21:52
  • I'd like to see a list comprehension for this... you'd need to flatten it in the end, right? As far as I am concerned, the `chain` solution is wonderful. –  Dec 14 '10 at 21:56
  • Approach #2 does not work. You would get `TypeError: type object argument after * must be a sequence, not generator`. – unutbu Dec 14 '10 at 22:14
  • #2 will not work, though: it chains iterators, and `node` isn't one. – Jochen Ritzel Dec 14 '10 at 22:19
  • @ubuntu: You can do it in 2.7. I think they added it in that version. – Jochen Ritzel Dec 14 '10 at 22:21
  • @THC4k, tested in Python 2.7, and I think @unutbu is right: it should look like itertools.chain([node], ...). One question: doesn't the *(...) break the laziness of traverse()? – tokland Dec 14 '10 at 22:24
  • I agree with delnan that the chain method is probably the best, though you need to make the code be `return itertools.chain([node],*(traverse(n) for n in node.children))` to get it to work (and use list() on traverse(headnode)). – Justin Peel Dec 14 '10 at 22:28
  • However, the yield method is about 3x faster than the chain method.. and quite readable so I'd go with that. – Justin Peel Dec 14 '10 at 22:47
  • Isn't this depth-first traversal of the tree? Weren't you asking for breadth-first? – Hugh Bothwell Dec 14 '10 at 23:27
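
On the laziness question raised in the comments: unpacking with `*` has to exhaust the generator expression before `chain` is even called, and since Approach #2 is an ordinary function rather than a generator, each of those recursive calls does the same for its own children. The whole tree is therefore visited up front. A minimal sketch of a variant using `itertools.chain.from_iterable`, which defers the recursive calls until the result is consumed, might look like this. The `Node` class, the `calls` list, and the name `traverse_lazy` are hypothetical, introduced only for illustration:

```python
import itertools


class Node:
    """Hypothetical tree node, used only for illustration."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


calls = []  # records which nodes traverse_lazy() has been called on


def traverse_lazy(node):
    calls.append(node.name)
    # chain.from_iterable pulls child iterators one at a time, so the
    # recursive calls happen only as the result is consumed.
    return itertools.chain(
        [node],
        itertools.chain.from_iterable(traverse_lazy(n) for n in node.children),
    )


tree = Node("a", [Node("b", [Node("d")]), Node("c")])

it = traverse_lazy(tree)
calls_before = list(calls)    # only the root has been visited at this point
names = [n.name for n in it]  # consuming the iterator walks the rest of the tree
```

Here `calls_before` holds only `"a"`, while the `*`-unpacking version would already have called `traverse` on every node before returning.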

4 Answers

5

[UPDATE] See PEP 380: this yield-all syntax is available as `yield from` starting with Python 3.3:

def traverse(node):
    yield node
    for n in node.children:
        yield from traverse(n)
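
For a quick check, here is a minimal runnable sketch of the same function on a small tree. The `Node` class is a hypothetical stand-in for whatever node type you use; it only needs a `children` attribute:

```python
class Node:
    """Hypothetical tree node, used only for illustration."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def traverse(node):
    # Yield this node first, then delegate to each child's traversal.
    yield node
    for n in node.children:
        yield from traverse(n)


tree = Node("root", [Node("left", [Node("leaf")]), Node("right")])
order = [n.name for n in traverse(tree)]
print(order)  # ['root', 'left', 'leaf', 'right']
```

The result is a flat, pre-order (depth-first) sequence, produced lazily as the generator is consumed.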
tokland
  • Ah, sometimes it seems everything I wish Python could be is buried in a PEP somewhere. I should start collecting pep modules, or perhaps just learn to accept that Python isn't *really* a functional language. – perimosocordiae Dec 15 '10 at 04:42
  • @perimosocordiae: Knowing GvR's opinions on FP, I wouldn't expect too much from this angle. But Python would definitely be a sadder language without the FP-related work of Kuchling and Hettinger (and others). – tokland Dec 15 '10 at 11:03
3

I'd go with the first. You'll get over propagating yields after doing it a couple of times. :-)

Lennart Regebro
1

This is an opinion question, so all the answers will just be value judgments. As far as I can tell, there's no elegant third way, though.

My opinion is that the first way wins hands down. It's clearer and easier to read -- Python isn't Haskell, even though it can do some functional stuff, and often the functional approach just doesn't look as neat.

Katriel
0

Traversing with node position:

def iter_tree(t, i=0, j=0):
    yield (i, j), t
    for j, n in enumerate(t.children):
        yield from iter_tree(n, i + 1, j)

for (i, j), n in iter_tree(t):
    print(i*'    ', (i, j), n)
APEFind