I have a generator gen, with the following properties:

  • it's quite expensive to make it yield (more expensive than creating the generator)
  • the elements take up a fair amount of memory
  • sometimes all of the __next__ calls will throw an exception, but creating the generator doesn't tell you when that will happen

I didn't implement the generator myself.

Is there a way to make the generator yield its first element (I will do this in a try/except), without having the generator subsequently start on the second element if I loop through it afterwards?

I thought of creating some code like this:

try:
    first = next(gen)
except StopIteration:
    return None
except Exception:
    print("Generator throws exception on a yield")

# looping also over the first element which we yielded already
for thing in (first, *gen):
    do_something_complicated(thing)

Solutions I can see which are not very nice:

  1. Create generator, test first element, create a new generator, loop through the second one.
  2. Put the entire for loop in a try/except; not so nice because the exception thrown by the yield is very general and it would potentially catch other things.
  3. Yield first element, test it, then reform a new generator from the first element and the rest of gen (ideally without extracting all of gen's elements into a list, since this could take a lot of memory).

For 3, which seems like the best solution, a nearly-there example would be the example I gave above, but I believe that would just extract all the elements of gen into a tuple before we start iterating, which I would like to avoid.
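A minimal sketch of option 3 that stays lazy, assuming `itertools.chain` is acceptable here. Unlike `(first, *gen)`, `chain` only pulls from `gen` as the loop advances. The names `expensive_gen` and `iter_checked` are hypothetical stand-ins, not part of any library:

```python
import itertools


def expensive_gen(fail=False):
    # Hypothetical stand-in for the third-party generator.
    if fail:
        raise RuntimeError("generator failed on its first yield")
    for i in range(3):
        yield i


def iter_checked(gen):
    """Pull the first element eagerly, then lazily replay it followed by the rest."""
    try:
        first = next(gen)
    except StopIteration:
        return iter(())  # the generator was empty: nothing to loop over
    # Any other exception raised by the first next() propagates to the caller.
    # chain() does not consume gen up front; it pulls one element at a time.
    return itertools.chain([first], gen)


for thing in iter_checked(expensive_gen()):
    print(thing)
```

Only the first element is held in memory before the loop starts; the caller can wrap the single `iter_checked(...)` call in a narrow try/except instead of wrapping the whole loop.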

Marses
  • Every iterator should have a `gen.__length_hint__()` function which returns the amount of remaining elements. But you have to handle it with care since it is a hint and might not contain the true length. – areop-enap Dec 12 '22 at 12:35
  • @areop-enap: Not all iterators have a `__length_hint__`. In particular, generators don't have one. – user2357112 Dec 12 '22 at 12:37
  • For option 3, see [`itertools.chain`](https://docs.python.org/3/library/itertools.html#itertools.chain). – user2357112 Dec 12 '22 at 12:37
  • How about creating a new class which takes a generator as an argument, it can have an internal queue which can help facilitate a new `peek()` operation (which can call next() on passed generator and then save the value in the queue). If the queue is non-empty, you pop from queue, else yield directly from the generator. Would make the generator slightly more expensive, but given your generator is already quite expensive should be fine. – Jay Dec 12 '22 at 12:49
  • I guess this is what you're suggesting Jay. My issue really stems from the fact that I wish I could have a try/except in the actual for statement; I would really need something like `for try: thing in gen except Exception: continue`, which is impossible. So I guess the better option would be to create a new generator that wraps the old one and does `yield next(gen)`, and has a try except there to catch the exceptions. My ideal way of handling the exceptions would be to `continue` the loop, so I guess I could make it yield a placeholder object to know when I should continue in the actual loop. – Marses Dec 12 '22 at 13:02
  • @Marses: The fact that you want to `continue` the loop suggests that you think there would be a next iteration. If a generator raises an exception, there won't *be* a next iteration. The generator is done. It cannot yield more elements. – user2357112 Dec 13 '22 at 03:36
  • Ah I see what you mean; so even if the exception thrown when yielding is not StopIteration but something else, it still wouldn't be able to continue? I didn't think of that, true. – Marses Dec 13 '22 at 12:28
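The point made in the last two comments can be checked with a small experiment. This is only a sketch; `flaky` is a hypothetical generator that fails on its second element:

```python
def flaky():
    # Hypothetical generator: yields once, then raises.
    yield 1
    raise ValueError("boom")
    yield 2  # never reached


gen = flaky()
results = []
for _ in range(3):
    try:
        results.append(next(gen))
    except ValueError:
        results.append("error")
    except StopIteration:
        results.append("finished")

print(results)  # [1, 'error', 'finished']
```

Once an exception escapes the generator's frame, the generator is exhausted, so every later `next()` raises `StopIteration`. Catching the error and continuing the loop cannot recover further elements.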

1 Answer

I think I have what you are looking for, using the more_itertools library:

import more_itertools

if __name__ == "__main__":
    generator = range(100)

    peekable_generator = more_itertools.peekable(generator)

    print(f"peek {peekable_generator.peek()}")
    print(f"next {next(peekable_generator)}")
    print(f"next {next(peekable_generator)}")

output: 
    peek 0
    next 0
    next 1

See documentation here: https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.peekable

If I'm not mistaken, the ability to peek at the first item is the key thing you need.
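If adding a dependency is not an option, the same idea can be sketched with the standard library alone. The `Peekable` class below is a hypothetical, minimal illustration of the technique, not how more_itertools implements it:

```python
class Peekable:
    """Minimal peek wrapper around any iterable (illustrative sketch only)."""

    _MISSING = object()  # sentinel meaning "nothing cached"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = self._MISSING

    def __iter__(self):
        return self

    def peek(self):
        # Pull one element and cache it. This may raise StopIteration
        # or whatever exception the underlying generator raises.
        if self._cache is self._MISSING:
            self._cache = next(self._it)
        return self._cache

    def __next__(self):
        # Hand out the cached element first, then defer to the source.
        if self._cache is not self._MISSING:
            value, self._cache = self._cache, self._MISSING
            return value
        return next(self._it)


items = Peekable(n * n for n in range(4))
first = items.peek()  # forces the first element without consuming it
rest = list(items)    # still starts from the first element
print(first, rest)
```

The `peek()` call can go inside a try/except, and the `for` loop afterwards still sees every element. At most one element is cached at a time, so memory use stays flat.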

user459742