I am looking for a way to copy a generator and then continue with the copy. It is like bifurcating a generator.

def Generator():
    myNumbers=range(3)
    for i in myNumbers:
        yield i

for i in Generator():
    bifurcatedGenerator = Generator
    for j in bifurcatedGenerator():
        print (i, j)

This code gives the following output:

0 0
0 1
0 2
1 0
1 1
1 2 <- wrong
2 0
2 1 <- wrong
2 2 <- wrong

whereas the desired output is the following. (The bifurcated generator needs to be a new instance, but it should continue from the point the old generator has reached.)

0 0
0 1
0 2
1 1
1 2
2 2

The real application is much more complicated; this is just a code example.

What matters (to me) is a semantically beautiful solution that is easily readable by third parties. Efficiency is not so important.

Marcel Sonderegger
  • Possible duplicate of [deep-copying a generator in python](https://stackoverflow.com/questions/21315207/deep-copying-a-generator-in-python) – internet_user Jun 07 '18 at 17:03

2 Answers

Why not use a generator with a start parameter (and a stop one while you are at it)?

def Generator(start=0, stop=3):
    for i in range(start, stop):
        yield i

for i in Generator():
    for j in Generator(start=i):
        print (i, j)

This gives the desired output:

0 0
0 1
0 2
1 1
1 2
2 2
Marcel Sonderegger
berna1111

Some people will tell you to use itertools.tee. Do not use itertools.tee.

Use a list

To keep track of the previous states of your generator, you need to store previously yielded values in a list. This is what the function itertools.tee does when it copies a generator.
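As a minimal sketch of that buffering (the variable names here are illustrative only): once one of the teed iterators has drained the underlying generator, the other can still replay every value, which is only possible because the values were stored.

```python
from itertools import tee

def generator():
    yield from range(3)

first, second = tee(generator())

# Draining `first` exhausts the underlying generator; tee keeps the values
# around because `second` has not consumed them yet.
consumed = list(first)

# `second` is now served entirely from tee's internal storage.
replayed = list(second)

print(consumed, replayed)
```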

Unfortunately, this removes the memory advantage of using a generator, so you are better off using a list.

def generator():
    yield from range(3)

lst = list(generator())

for i in range(len(lst)):
    for j in range(i, len(lst)):
        print(lst[i], lst[j])

Output:

0 0
0 1
0 2
1 1
1 2
2 2
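If the index arithmetic feels noisy, the same list-based idea can be wrapped in a small helper. The sketch below is one possible phrasing, not part of the original answer; `bifurcated_pairs` is a hypothetical name, and it assumes the generator is finite so that it can be materialized with `list`.

```python
def generator():
    yield from range(3)

def bifurcated_pairs(iterable):
    """Yield (x, y) for each item x and every item y from x onward.

    Hypothetical helper: it materializes the iterable in a list first.
    """
    items = list(iterable)
    for start, x in enumerate(items):
        # Slicing from `start` plays the role of the bifurcated generator.
        for y in items[start:]:
            yield x, y

pairs = list(bifurcated_pairs(generator()))
for x, y in pairs:
    print(x, y)
```

Slicing copies part of the list on every outer step, which is acceptable here since efficiency is stated not to be a priority.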

Why not use itertools.tee then?

It is still possible to use itertools.tee, but you should not.

from itertools import tee

def generator():
    yield from range(3)

lst = list(generator())

main_gen, bif_gen = tee(generator())

for i in main_gen:
    for j in bif_gen:
        print(i, j)
    _, bif_gen = tee(main_gen) # Yes, you *must* use the second item here

The reason the previous code works is subtle: when given a tee object, itertools.tee returns that same tee object as its first output value. This is why the second item must be used.

This, coupled with the fact that the documentation explicitly states that a list is better in this situation, shows that the first solution should be preferred:

This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().
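For completeness, here is a sketch of the same bifurcation that does not depend on which item of the tuple tee returns. It assumes that tee objects can be duplicated with `copy.copy`, a behavior that is only lightly documented, so treat it as an illustration rather than a recommendation; the names `main_gen`, `branch` and `pairs` are illustrative.

```python
import copy
from itertools import tee

def generator():
    yield from range(3)

# tee(..., 1) wraps the generator in a single tee object.
# Assumption: tee objects support copy.copy (barely documented).
(main_gen,) = tee(generator(), 1)

pairs = []
while True:
    # Snapshot the current position *before* advancing, so that the branch
    # also yields the item main_gen is about to produce.
    branch = copy.copy(main_gen)
    try:
        i = next(main_gen)
    except StopIteration:
        break
    for j in branch:
        pairs.append((i, j))

for i, j in pairs:
    print(i, j)
```

The storage caveat quoted above still applies: every branch is exhausted before the main iterator moves on, which is the situation where the documentation suggests a list instead.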

Olivier Melançon
  • Rather than relying on the completely undocumented fact that teeing a tee reuses the underlying tee structure of the input, you could use the barely documented fact that tees are copyable with `copy.copy`. – user2357112 Jun 07 '18 at 17:34
  • (The only documentation that tees are copyable is in the `tee_lookahead` example in the Python 2 docs, at the end of the [itertools recipes](https://docs.python.org/2/library/itertools.html#recipes).) – user2357112 Jun 07 '18 at 17:35
  • I am looking for a beautiful solution (efficiency is not so important), so I like your itertools.tee solution. Your line lst = list(generator()) is not necessary. – Marcel Sonderegger Jun 07 '18 at 17:36
  • @user2357112 The whole point is that it is documented that tee *should not* be used here – Olivier Melançon Jun 07 '18 at 17:36
  • @MarcelSonderegger No, the whole point here is that you *must not* use tee, it could backfire – Olivier Melançon Jun 07 '18 at 17:37
  • @OlivierMelançon: Depending on whether the actual use case completely exhausts the bifurcated generators, `tee` could be appropriate. The example in the question exhausts them, but question code is frequently unrepresentative. – user2357112 Jun 07 '18 at 17:39
  • @user2357112 At first I wanted to show that it does not work. Unfortunately, tee is pretty smart and so it did for undocumented reasons. That is why I prefer to explain why it works but shouldn't be used, otherwise some people might come up with that solution and think it is ok to use it in general. – Olivier Melançon Jun 07 '18 at 17:39
  • @MarcelSonderegger Please tell me you are not using the tee non-solution. It takes more lines, it is less efficient in time, there are documented reasons not to do it, and I specifically said I showed it so that you would know not to use it when people give you that solution. It's like I show you a lion so you can avoid it, and you decide to keep it as a pet. – Olivier Melançon Jun 07 '18 at 17:52
  • I can agree that the 'tee' solution should not be used, but the list solution is IMHO somewhat ugly. Having looked up the itertools.tee function, we have to agree that it is really clever. – Marcel Sonderegger Jun 07 '18 at 18:11
  • @MarcelSonderegger If you do not like using `list` you can use `[x for x in generator()]` instead – Olivier Melançon Jun 07 '18 at 18:18
  • Finally I give up, as this seems to have been an open topic since 2009: PEP 380 -- Syntax for Delegating to a Subgenerator – Marcel Sonderegger Jun 07 '18 at 18:31
  • @MarcelSonderegger What you are reading is about delegating to another generator, that is, the yield from syntax. It is not related, and it is an accepted and released PEP. – Olivier Melançon Jun 07 '18 at 18:36