
I have two functions returning generators:

def f1():
    return (i for i in range(1000))

def f2():
    return ((yield i) for i in range(1000))

Apparently, the generator returned from f2() is about twice as slow as the one from f1():

Python 3.6.5 (default, Apr  1 2018, 05:46:30) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit, dis
>>> timeit.timeit("list(f1())", globals=globals(), number=1000)
0.057948426001530606
>>> timeit.timeit("list(f2())", globals=globals(), number=1000)
0.09769760200288147

I tried using dis to see what's going on, but to no avail:

>>> dis.dis(f1)
  2           0 LOAD_CONST               1 (<code object <genexpr> at 0x7ffff7ec6d20, file "<stdin>", line 2>)
              2 LOAD_CONST               2 ('f1.<locals>.<genexpr>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (1000)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE
>>> dis.dis(f2)
  2           0 LOAD_CONST               1 (<code object <genexpr> at 0x7ffff67a25d0, file "<stdin>", line 2>)
              2 LOAD_CONST               2 ('f2.<locals>.<genexpr>')
              4 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               3 (1000)
             10 CALL_FUNCTION            1
             12 GET_ITER
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

Apparently, the results from dis are the same.

So why is the generator returned from f1() faster than the one from f2()? And what is the proper way to debug this? Apparently dis fails in this case.

EDIT 1:

Using next() instead of list() in timeit reverses the results (or they are the same in some cases):

>>> timeit.timeit("next(f1())", globals=globals(), number=10**6)
1.0030477920008707
>>> timeit.timeit("next(f2())", globals=globals(), number=10**6)
0.9416838550023385

EDIT 2:

Apparently it's a bug in Python, fixed in 3.8. See yield in list comprehensions and generator expressions.

A generator expression with yield inside actually yields two values per iteration.
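
A quick check (with the same f1/f2 as above) shows the doubled output; every other element is the None that plain iteration sends back into the generator:

>>> len(list(f1()))
1000
>>> len(list(f2()))
2000
>>> list(f2())[:4]
[0, None, 1, None]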

Andrej Kesely
  • doing `next(f1())` instead of `list(f1())` for instance [reverses the results](https://repl.it/repls/DenseTalkativeApplicationprogrammer). It is not only how the `f()`s are set up, but also what you do with them. – Ma0 Jul 11 '18 at 09:02
  • @Ev.Kounis yes, doing `timeit.timeit("next(f1())", globals=globals(), number=10**7)` is slower than using f2(). Interesting... – Andrej Kesely Jul 11 '18 at 09:06
  • Because `(yield i)` is converted to a `YIELD_VALUE` opcode, and parsing the syntax plus pushing/popping that opcode on the stack takes more time. And the reason you can't see that is that generator expressions are converted to generator functions, and `dis` doesn't give a nested disassembled representation of the bytecode. – Mazdak Jul 11 '18 at 09:08

3 Answers


Yield in generator expressions is actually a bug, as discussed in this related question.

If you want to actually see what's going on with dis, you need to introspect the nested code object stored in the function's co_consts[1], so:

>>> dis.dis(f1.__code__.co_consts[1])
  2           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                11 (to 17)
              6 STORE_FAST               1 (i)
              9 LOAD_FAST                1 (i)
             12 YIELD_VALUE
             13 POP_TOP
             14 JUMP_ABSOLUTE            3
        >>   17 LOAD_CONST               0 (None)
             20 RETURN_VALUE
>>> dis.dis(f2.__code__.co_consts[1])
  2           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                12 (to 18)
              6 STORE_FAST               1 (i)
              9 LOAD_FAST                1 (i)
             12 YIELD_VALUE
             13 YIELD_VALUE
             14 POP_TOP
             15 JUMP_ABSOLUTE            3
        >>   18 LOAD_CONST               0 (None)
             21 RETURN_VALUE

So, it yields twice.
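
Roughly speaking, the genexpr body in f2 behaves like this hand-written generator function (a sketch for illustration; the name _f2_body is made up, and it is not literally what the compiler emits):

def _f2_body(it):
    for i in it:
        # the inner `yield i` produces i; the enclosing genexpr then
        # yields whatever value was sent back in (None under plain iteration)
        yield (yield i)

Iterating it, list(_f2_body(range(3))) gives [0, None, 1, None, 2, None], i.e. two YIELD_VALUE opcodes per loop iteration.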

juanpa.arrivillaga
  • Ok, now I see. I'm using Python 3.6 so it yields twice. – Andrej Kesely Jul 11 '18 at 09:17
  • I would not call it a bug. The documentation for Python 3.6 says at *6.2.9. Yield expressions* that it is the expected behaviour: *The value of the yield expression after resuming depends on the method which resumed the execution. If `__next__()` is used (typically via either a `for` or the `next()` builtin) then the result is `None`...*. As it is assumed not to be what the *programmer* expected, `yield` inside a generator expression is deprecated in 3.7 and a syntax error in 3.8. But it is not a Python bug (at most a program one). – Serge Ballesta Jul 11 '18 at 09:35

Maybe it is because the generator returned by f2 returns twice the number of elements.

Just look at what happens:

>>> def f2():
...     return ((yield i) for i in range(10))
...
>>> g = f2()
>>> print([i for i in g])
[0, None, 1, None, 2, None, 3, None, 4, None, 5, None, 6, None, 7, None, 8, None, 9, None]

Using yield inside a generator expression makes it yield an extra None after each actual item.
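
That None is simply whatever value gets sent back into the generator; a small sketch with send() (using the same f2 as above) makes the interleaving visible:

>>> g = f2()
>>> next(g)       # the inner `yield i` produces 0
0
>>> g.send('x')   # the genexpr then yields back whatever was sent in
'x'
>>> next(g)       # resume the loop: next value from range()
1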

Serge Ballesta

Your formulation of f2() is the first thing that struck me as unusual. Writing f2() the way most people would yields (pun intended) very different results:

import timeit

def f1():
    return (i for i in range(1000))

def f2():
    for i in range(1000):
        yield i

res1 = timeit.timeit("list(f1())", globals=globals(), number=1000)
print(res1)  # 0.05318646361085916
res2 = timeit.timeit("list(f2())", globals=globals(), number=1000)
print(res2)  # 0.05284952304875785

So the two seem to be equally fast.

As others say in their answers, this probably has to do with the fact that your f2() returns twice as many elements.
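
For completeness, a per-item comparison against the original genexpr version is a reasonable sanity check. This is only a sketch: f2_genexpr is a made-up name for the original definition, it only runs on 3.6/3.7 (yield inside a genexpr is a SyntaxError from 3.8 on), and the absolute numbers will vary by machine:

import timeit

def f1():
    return (i for i in range(1000))

def f2_genexpr():  # hypothetical name for the original f2
    return ((yield i) for i in range(1000))

n1 = len(list(f1()))           # 1000 items
n2 = len(list(f2_genexpr()))   # 2000 items: each iteration yields the value and then None

t1 = timeit.timeit("list(f1())", globals=globals(), number=1000)
t2 = timeit.timeit("list(f2_genexpr())", globals=globals(), number=1000)

print(t1 / n1, t2 / n2)        # cost per produced item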

Ma0