21

I'm using a generator function, say:

def foo():
    i = 0
    while i < 10:
        i += 1
        yield i

Now, I would like the option to copy the generator after any number of iterations, so that the new copy will retain the internal state (will have the same 'i' in the example) but will now be independent from the original (i.e. iterating over the copy should not change the original).

I've tried using copy.deepcopy but I get the error:

 "TypeError: object.__new__(generator) is not safe, use generator.__new__()"   

Obviously, I could solve this using regular functions with counters for example. But I'm really looking for a solution using generators.

Kyle Strand
Cain
  • 6
    I don't *think* it's possible. If all you need is two iterations over the results, then read to a `list` and iterate the `list` multiple times, or `itertools.tee` might help. It doesn't copy the generator, though, it just stores up results in a queue and spits them out again later. So any side-effects your function has won't be executed again when you read the "copy" (which presumably is what you'd want from a true clone of a generator), and if you read from the "copies" it *will* advance the underlying generator -- once you've teed you basically need all readers to use tees, not the original. – Steve Jessop Jan 23 '14 at 17:27
  • What is the problem you really want to solve? There are multiple possible answers. – Corley Brigman Jan 23 '14 at 17:30
  • 1
    You might know this already, but `Define` is not legal python. The correct keyword is `def` – SethMMorton Jan 23 '14 at 18:16
  • Nope, still wrong after the edit. Case is important in Python. `Def` is not legal either. Just `def`. – SethMMorton Jan 24 '14 at 19:24

2 Answers

13

There are three cases I can think of:

  • Generator has no side effects, and you just want to be able to walk back through results you've already captured. You could consider a cached generator instead of a true generator. You can share the cached generator around as well, and if any client walks to an item that hasn't been produced yet, the underlying generator will advance. This is similar to itertools.tee(), but does the tee functionality in the generator/cache itself instead of requiring the client to do it.

  • Generator has side effects, but no history, and you want to be able to restart anywhere. Consider writing it as a coroutine, where you can pass in the value to start at any time.

  • Generator has side effects AND history, meaning that the state of the generator at G(x) depends on the results of G(x-1), and so you can't just pass x back into it to start anywhere. In this case, I think you'd need to be more specific about what you are trying to do, as the result depends not just on the generator, but on the state of other data. Probably, in this case, there is a better way to do it.
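For the first two cases, a minimal sketch might look like the following. The names `CachedIterator`, `view`, and `restartable_foo` are hypothetical, made up for illustration; this is one possible shape under the assumptions stated in each case, not a definitive implementation.

```python
class CachedIterator:
    """Hypothetical helper for case 1: caches values so views can replay them."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def view(self, start=0):
        """Return an independent generator over the cached sequence from `start`."""
        index = start
        while True:
            # Pull from the underlying iterator only when this view
            # runs past what has already been cached.
            while index >= len(self._cache):
                try:
                    self._cache.append(next(self._source))
                except StopIteration:
                    return
            yield self._cache[index]
            index += 1


def foo():
    i = 0
    while i < 10:
        i += 1
        yield i


def restartable_foo(start=0, stop=10):
    """Hypothetical variant for case 2: send(n) repositions the counter at n."""
    i = start
    while i < stop:
        i += 1
        jump = yield i
        if jump is not None:
            i = jump


# Case 1: a second view positioned where the first one currently is.
cached = CachedIterator(foo())
first = cached.view()
consumed = [next(first) for _ in range(3)]
second = cached.view(start=len(consumed))

# Case 2: restart the coroutine-style generator from a chosen value.
gen = restartable_foo()
next(gen)
resumed = gen.send(7)  # counting continues from 7, so this yields 8
```

Here `first` and `second` advance independently, but they share one cache, so the underlying generator body runs only once per value. That is why this only fits the no-side-effects case.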

Corley Brigman
7

The comment for itertools.tee was my first guess as well. Because of the warning that you shouldn't advance the original generator any longer after using tee, I might write something like this to spin off a copy:

>>> from itertools import tee
>>>
>>> def foo():
...   i = 0
...   while i < 10:
...     i += 1
...     yield i
...
>>>
>>> it = foo()
>>> next(it)
1
>>> it, other = tee(it)
>>> next(it)
2
>>> next(other)
2
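As noted in the comments on the question, tee does not clone the generator; it buffers values and replays them. A small sketch of what that means for side effects, where `noisy` and `calls` are hypothetical names used only for this illustration:

```python
from itertools import tee

calls = []  # records each step the generator body actually executes


def noisy():
    for i in range(1, 4):
        calls.append(i)  # stand-in for a real side effect
        yield i


a, b = tee(noisy())
first_pass = list(a)   # drives the underlying generator
second_pass = list(b)  # replayed from tee's internal buffer
```

Both passes see the same values, but `calls` ends up with three entries rather than six, since the generator body ran only once. If the side effects need to happen again for the "copy", tee is probably not the right tool.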
g.d.d.c