
I was playing around with clever ways to create a Python generator for sequence A003602.

This appears to work, but I can't figure out why. It seems to me like it should hit infinite recursion. Is python doing some lazy evaluation somewhere that I don't recognize?

def N():
    i=1
    while True:
        yield i
        i+=1

def interleave(a,b):
    def result():
        x=a()
        y=b()
        while True:
            yield next(x)
            yield next(y)
    return result

def RZ():
    return interleave(N,RZ)()

gen=RZ()

To me it seems like, since RZ instantly calls the function returned by interleave, which in turn calls b, which is RZ (before the first call to yield), this should be infinite recursion. But it actually seems to work. Can anyone explain why?

dspyz
  • it reminds me of [streams in sicp](http://mitpress.mit.edu/sicp/full-text/sicp/book/node69.html). – jfs Jun 17 '12 at 02:31

3 Answers


Generators (any function containing a yield statement) are lazy: calling one builds a generator object without executing any of its body. This means that result() will not start executing until you request the first value from it, which you never do here.

The other key point is that you ask for a value from x first. The generator therefore never gets around to asking its child generator until at least the second value is requested. Consider this simpler example:

def test():
    yield 1
    a = test()
    while True:
        yield next(a)

a = test()
for i in range(10):
    print(next(a))

This works, just as yours does. It has the potential for infinite recursion, but it will only recurse that deep if you ask for that many values. All you have to do is remove the yield 1 to get the expected failure. In yours, just switch N and RZ in interleave and ask for a value - you will get the expected recursion.
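To make both outcomes concrete, here is a hedged sketch (the function names are my own; in Python 3.5+ the failure is a RecursionError, which subclasses RuntimeError, so the except clause below also works on older interpreters):

```python
def no_first_yield():
    # Variant with "yield 1" removed: the first next() must immediately
    # ask the freshly created child, which asks its child, and so on.
    a = no_first_yield()   # cheap: only creates a generator object
    while True:
        yield next(a)      # recursing here, before anything is yielded

def swapped():
    # Variant with "a = test()" moved above "yield 1": still fine,
    # because creating the child generator runs none of its body.
    a = swapped()
    yield 1
    while True:
        yield next(a)

gen = no_first_yield()     # constructing it is harmless: no body code runs
try:
    next(gen)              # one live frame per nesting level -> limit hit
except RuntimeError:       # RecursionError in Python 3.5+
    print("recursion limit hit on the very first next()")

g = swapped()
print(next(g))             # 1 -- the child created inside was never advanced
```

The difference is entirely about whether a yield executes before the first next() on the child, not about where the child is created.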

Gareth Latty
  • @J.F.Sebastian Yes, because the same happens when `y=b()` is called (a new generator is set up) and then it halts when it reaches a yield. It might do a bit more stuff, but it's not going to infinitely recurse, as you are not asking for infinite values. – Gareth Latty Jun 17 '12 at 01:42
  • I thought you meant `gen=RZ()`. Until you call `next()` it does nothing (`y=b()` just returns the generator, it doesn't execute any code in it (including the code before the first `yield`). – jfs Jun 17 '12 at 01:49
  • My point is that the generator will always yield a value from `N` before getting to `RZ` again. This means it won't end up in infinite recursion, as it has to be asked for the second value at each level of recursion before it spawns a child. – Gareth Latty Jun 17 '12 at 01:51
  • 1
    yes. As long as the generator yields something before calling `next()` on itself. You can swap `yield` and `a = test()` and [it works](http://ideone.com/Emy4c). It might be easier to see that there is no recursion if the generator is defined as a class with `__init__` and `__iter__` methods. – jfs Jun 17 '12 at 02:11
  • @J.F.Sebastian I don't see how that would make anything easier. A generator does **nothing** until the first value is requested from it, I make that clear enough, which makes it quite clear why this functions as it does. – Gareth Latty Jun 17 '12 at 02:13
  • I think I understand now. I thought that when a function with yield is called, it proceeds immediately until the first yield statement, and then waits for a call to next before continuing. In other words, I wouldn't have expected your test to generate infinite recursion, but I would have expected it if you swap yield 1 with a=test() – dspyz Jun 17 '12 at 05:03
  • What happens if you have a function with both yields and a return in it? How does the interpreter decide whether to be lazy or not? Doesn't it depend which statement gets executed? – dspyz Jun 17 '12 at 05:04
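Regarding that last comment: whether a call is lazy is decided at compile time, not by which statement happens to run first. If yield appears anywhere in a function body, calling the function always returns a generator and runs no code, and a bare return inside it simply ends the iteration. A small illustrative sketch (the function name is made up for this example):

```python
def maybe_yields(flag):
    # "yield" appears in the body, so this is a generator function:
    # the call itself runs nothing, no matter which branch executes.
    if flag:
        return        # inside a generator, return just stops iteration
    yield "value"

print(list(maybe_yields(True)))    # [] -- body ran to return, no items
print(list(maybe_yields(False)))   # ['value']
```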

The other answers have explained why there's no immediate infinite-recursion issue (generators are evaluated lazily). However, I think it is also interesting to consider when you might hit the finite recursion limits that exist in the Python interpreter.

First off, I noticed that your code can be simplified a bit (you've got a couple more functions than are really needed, and your N generator is identical to itertools.count(1)). So here's a simpler version of your generator:

from itertools import count

def RZ():
    x=count(1)
    y=RZ()
    while True:
        yield next(x)
        yield next(y)

Sample output:

>>> gen = RZ()
>>> for i in range(1, 21):
    print i, next(gen)


1 1
2 1
3 2
4 1
5 3
6 2
7 4
8 1
9 5
10 3
11 6
12 2
13 7
14 4
15 8
16 1
17 9
18 5
19 10
20 3

Next, I wrote a function that introspects into the nested generators and counts how deeply they have been evaluated. I'm not sure whether this code is portable between Python versions (I'm using 2.7):

def getDepth(gen):
    depth = 0
    while gen:
        depth += 1
        gen = gen.gi_frame.f_locals.get("y")
    return depth

Output from that (index, depth, value):

>>> for i in range(1, 21):
    print i, getDepth(gen), next(gen)

1 1 1
2 2 1
3 3 2
4 3 1
5 4 3
6 4 2
7 4 4
8 4 1
9 5 5
10 5 3
11 5 6
12 5 2
13 5 7
14 5 4
15 5 8
16 5 1
17 6 9
18 6 5
19 6 10
20 6 3

Those depth values are growing logarithmically. Specifically, the depth of nested generators required to produce the Nth value of the sequence is ceil(log(N, 2)) + 1.
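That formula can be checked directly. Below is a hedged Python 3 re-run (print-function syntax; get_depth uses the same frame-walking idea, which relies on the CPython-specific gi_frame and f_locals attributes), asserting that the depth observed just before fetching the nth value equals ceil(log2(n)) + 1:

```python
from itertools import count
from math import ceil, log2

def RZ():
    x = count(1)
    y = RZ()          # creates the child generator; runs none of its body
    while True:
        yield next(x)
        yield next(y)

def get_depth(gen):
    # Follow the chain of nested generators through each frame's local "y".
    depth = 0
    while gen is not None:
        depth += 1
        gen = gen.gi_frame.f_locals.get("y")
    return depth

gen = RZ()
for n in range(1, 101):
    assert get_depth(gen) == ceil(log2(n)) + 1
    next(gen)
print("depth == ceil(log2(n)) + 1 holds for the first 100 values")
```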

In my copy of Python, recursion is allowed (by default) to go up to 100 levels deep. The generator will hit that limit only after 2^99 + 1 (= 633,825,300,114,114,700,748,351,602,689) items have been produced. I wouldn't hold my breath waiting for that.

Blckknght
  • Thank you. Although as I said in my above comment. I was under the impression that things get evaluated in a slightly different order. In that case it would only take O(n) operations to go n levels deep. Also I didn't know about the itertools library. Thank you – dspyz Jun 17 '12 at 05:09
  • @Blckknght +1 for actually measuring the recursion depth and upper bounds, really nice! – Samy Vilar Jun 17 '12 at 06:20

You say it seems like RZ instantly calls the function returned by interleave, which in turn calls b, which is RZ, before the first call to yield. The catch is that calling a function that contains yield does not execute its body: it only returns a generator object, and nothing gets evaluated until you start iterating. Since result contains yield, interleave(N, RZ)() just builds a generator, so RZ() returns a generator too, with no recursion.

Each next() advances the chain one step. The first next() runs result up to yield next(x), which in turn runs N up to yield i and stops. A later call reaches yield next(y), which starts the nested result (running its x = a(); y = b() and its first yield next(x)) and stops. So the code does keep creating generators, though not very quickly, which may or may not be good ...

count = 0
def N():
    i=1
    global count
    count += 1
    while True:
        yield i
        i+=1

def interleave(a,b):
    def result():
        global count
        count += 1
        x=a()
        y=b()
        while True:
            yield next(x)
            yield next(y)
    return result

def RZ():
    return interleave(N,RZ)()

gen = RZ()
r = [next(gen) for index in xrange(100)]
print count

it had created 14 generator objects after 100 iterations, 20 after 1000, 28 after 10000, and 34 after 100000, so the build-up is quite slow ...
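Those counts follow a closed form: a new interleave level (together with the N generator it drives) starts each time the number of produced values reaches a power of two, so after n values the counter reads 2 * (floor(log2(n)) + 1). A Python 3 re-run of the instrumented code (range instead of xrange, print as a function) checking exactly those numbers:

```python
from math import floor, log2

count = 0

def N():
    global count
    count += 1          # runs only when this generator is first advanced
    i = 1
    while True:
        yield i
        i += 1

def interleave(a, b):
    def result():
        global count
        count += 1      # likewise deferred until the first next()
        x = a()
        y = b()
        while True:
            yield next(x)
            yield next(y)
    return result

def RZ():
    return interleave(N, RZ)()

gen = RZ()
produced = 0
for target in (100, 1000, 10000, 100000):
    while produced < target:
        next(gen)
        produced += 1
    assert count == 2 * (floor(log2(target)) + 1)
    print(target, count)   # 100 14, 1000 20, 10000 28, 100000 34
```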

Samy Vilar