Python 3.3

I've constructed this slightly cryptic piece of Python 3.3:

>>> [(yield from (i, i + 1, i)) for i in range(5)]
<generator object <listcomp> at 0x0000008666D96900>
>>> list(_)
[0, 1, 0, 1, 2, 1, 2, 3, 2, 3, 4, 3, 4, 5, 4]

If I use a generator expression inside a list constructor, I get a different result:

>>> list((yield from (i, i + 1, i)) for i in range(5))
[0, 1, 0, None, 1, 2, 1, None, 2, 3, 2, None, 3, 4, 3, None, 4, 5, 4, None]

Why isn't the list comprehension returning a list?

Python 2.7

I can get a similarly odd effect in Python 2 (using a set comprehension, because list comprehensions have odd scope):

>>> {(yield i) for i in range(5)}
<generator object <setcomp> at 0x0000000004A06120>
>>> list(_)
[0, 1, 2, 3, 4, {None}]

And when using a generator expression:

>>> list((yield i) for i in range(5))
[0, None, 1, None, 2, None, 3, None, 4, None]

Where'd that {None} come from?

Eric
  • 3
    Inspired by the atrocity in [How does this lambda/yield/generator comprehension work?](https://stackoverflow.com/questions/15955948/how-does-this-lambda-yield-generator-comprehension-work) – Eric Jan 07 '14 at 14:11
  • You've produced a generator that generates a generator. – Lasse V. Karlsen Jan 07 '14 at 14:12
  • 1
    [This seems related](https://groups.google.com/forum/#!topic/python-ideas/JOFw5Al-kEM) but I don't grok it enough to explain it myself, yet. – kojiro Jan 07 '14 at 14:15
  • Isn't yield supposed to produce a generator, i.e. not a collection, but a structure capable of generating a collection? – Oleg Sklyar Jan 07 '14 at 14:15
  • It seems to me that it has something to do with the internals of whether python is building a function (set comp or, in python 3, any comp), or an object (list comp in python 2). – kojiro Jan 07 '14 at 14:17
  • IOW since it's a function, the `yield` expression makes it a generator. In Python 2 it's not a function, so…you just can't use `yield` at all in a listcomp. – kojiro Jan 07 '14 at 14:24
  • Good thing that using `yield` here isn't really necessary. The straight forward `[ j for i in range(5) for j in (i, i+1, i) ]` works as expected. – Alfe Jan 07 '14 at 14:26
  • See also [this question](http://stackoverflow.com/questions/12358063/use-of-yield-with-a-dict-comprehension). – DSM Jan 07 '14 at 17:06

2 Answers

4

Using this as a reference:

Python 3 explanation

This:

values = [(yield from (i, i + 1, i)) for i in range(5)]

Translates to the following in Python 3.x:

def _tmpfunc():
    _tmp = []
    for i in range(5):
        # the yield from expression evaluates to None once the tuple is exhausted
        _tmp.append((yield from (i, i + 1, i)))
    return _tmp
values = _tmpfunc()

Which results in values containing a generator, not a list.

That generator will then yield from each (i, i + 1, i), until finally reaching the return statement. In Python 3.3, this raises StopIteration(_tmp). The list constructor treats that exception as the normal end of iteration and discards the value it carries, so the list that was built is never seen.


On the other hand, this:

list((yield from (i, i + 1, i)) for i in range(5))

Translates to the following in Python 3.x:

def _tmpfunc():
    for i in range(5):
        yield (yield from (i, i + 1, i))

values = list(_tmpfunc())

This time, each time a yield from completes, it evaluates to None, and the implicit yield of the generator expression then emits that None among the other values.
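A small sketch of the same translation with the two yields separated may make the extra None easier to spot. The names _genexpr and inner_result are illustrative only, not anything the compiler produces.

```python
def _genexpr():
    for i in range(5):
        # explicit yield from written by the user: emits i, i + 1, i
        inner_result = yield from (i, i + 1, i)
        # implicit yield added by the generator expression: emits None
        yield inner_result


values = list(_genexpr())
print(values)
```

Every fourth element of values is the None produced by the implicit yield.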

Eric
  • Yeah. I'd say using `yield` in this place simply breaks the concept; the compiler should reject it as it will never lead to useful results. (But I'd be happy to be corrected.) – Alfe Jan 07 '14 at 14:29
  • 1
    Is the `_tmp` variable created and returned in the `_tmpfunc()`? Or maybe it is created outside, passed to the function, and changed in-place, to that no `return` is even necessary? – Alfe Jan 07 '14 at 14:32
  • @Alfe: Ah, I believe that may be the distinction between python 2 and 3 - 3 is definitely doing a return, because there's a `StopIteration` – Eric Jan 07 '14 at 14:33
3

List (set, dict) comprehensions translate to a different code structure from generator expressions. Let's look at a set comprehension:

import dis

def f():
    return {i for i in range(10)}

dis.dis(f.__code__.co_consts[1])
  2           0 BUILD_SET                0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (i)
             12 LOAD_FAST                1 (i)
             15 SET_ADD                  2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE        

Compare to the equivalent generator expression:

def g():
    return (i for i in range(10))

dis.dis(g.__code__.co_consts[1])
  2           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                11 (to 17)
              6 STORE_FAST               1 (i)
              9 LOAD_FAST                1 (i)
             12 YIELD_VALUE         
             13 POP_TOP             
             14 JUMP_ABSOLUTE            3
        >>   17 LOAD_CONST               0 (None)
             20 RETURN_VALUE        

You'll notice that where the generator expression has a yield, the set comprehension stores a value directly into the set it is building.
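As a rough source-level reading of the two listings (an approximation, not what the compiler literally emits), the set comprehension behaves like an ordinary function that returns a set, while the generator expression behaves like a generator function. The function names are invented for this sketch.

```python
def setcomp_equivalent(iterable):
    result = set()            # BUILD_SET
    for i in iterable:        # FOR_ITER / STORE_FAST
        result.add(i)         # SET_ADD
    return result             # RETURN_VALUE


def genexpr_equivalent(iterable):
    for i in iterable:        # FOR_ITER / STORE_FAST
        yield i               # YIELD_VALUE, then POP_TOP discards the sent value
    # falls off the end: LOAD_CONST None / RETURN_VALUE


built = setcomp_equivalent(range(10))
lazy = genexpr_equivalent(range(10))
print(built)
print(list(lazy))
```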

This means that if you add a yield expression into the body of a generator expression, it is treated indistinguishably from the yield that the language constructs for the generator body; as a result, you get two (or more) values per iteration.

However, if you add a yield to a list (set, dict) comprehension, the comprehension is transformed from a function that builds a list (set, dict) into a generator that executes the yield expressions and then returns the constructed list (set, dict). The {None} in the set comprehension result is the set built from the Nones that the yield expressions evaluate to: five Nones collapse into a single element.
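The following sketch spells out that transformed set comprehension as an explicit generator function (the name _setcomp is illustrative). It runs on Python 3.3 or later, where the returned set travels on StopIteration rather than appearing as a final item. Sending values into the generator shows that the set is built from whatever each yield expression evaluates to.

```python
def _setcomp():
    result = set()
    for i in range(5):
        result.add((yield i))  # adds whatever the yield expression evaluates to
    return result


# Driven with next(), every yield expression evaluates to None.
gen = _setcomp()
yielded = []
try:
    while True:
        yielded.append(next(gen))
except StopIteration as stop:
    built = stop.value

# Driven with send(), the set collects the values that were sent in.
gen = _setcomp()
next(gen)  # advance to the first yield
try:
    for label in "abcde":
        gen.send(label)
except StopIteration as stop:
    labelled = stop.value

print(yielded, built, labelled)
```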


Finally, why does Python 3.3 not produce a {None}? (Note that previous versions of Python 3 do.) It's because of PEP 380 (a.k.a. yield from support). Prior to Python 3.3, a return with a value in a generator is a SyntaxError ('return' with argument inside generator), so our yielding comprehensions are exploiting undefined behaviour; the actual effect of the RETURN_VALUE opcode there is to produce one more (final) value from the generator. In Python 3.3, return value inside a generator is explicitly supported: the RETURN_VALUE opcode raises StopIteration with the value attached, which stops the generator without producing a final value.
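A short sketch of the PEP 380 behaviour on Python 3.3 or later, using made-up generator names: plain iteration drops a generator's return value, whereas yield from in a delegating generator receives it.

```python
def inner():
    yield 1
    yield 2
    return "done"  # legal since Python 3.3; raises StopIteration("done")


captured = []


def outer():
    # yield from re-yields 1 and 2, then evaluates to inner's return value
    result = yield from inner()
    captured.append(result)


plain = list(inner())      # the return value is discarded
delegated = list(outer())  # same items, but outer() saw "done"
print(plain, delegated, captured)
```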

ecatmur