Say I have the following function:

async def f1():
    async for item in asynciterator():
        return

What happens to the async iterator after the following call completes?

await f1()

Should I worry about cleaning up, or will the generator somehow be garbage collected when it goes out of scope?

Liviu
  • A guess: `f1()` returns a coroutine object, which is just an awaitable object on the heap that holds the function's frame (local variables etc.). Garbage collection should therefore clean it up just fine. In that case, though, you don't want `f1` to hold any external resources such as file handles. – Felk Nov 08 '18 at 17:46
  • How about the asynciterator? In the asynciterator I am using an aiohttp session as a context manager to perform a GET. After the GET, I parse the body and yield items from it. Should I release the aiohttp session as soon as I receive the HTTP response, before I start parsing and yielding items? – Liviu Nov 08 '18 at 18:01
  • Related: [loop.shutdown_asyncgens](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.shutdown_asyncgens) and [sys.set_asyncgen_hooks](https://docs.python.org/3/library/sys.html#sys.set_asyncgen_hooks) – Vincent Nov 09 '18 at 13:18
  • More precisely, when an async generator is about to get garbage collected, asyncio schedules the `agen.aclose()` coroutine. – Vincent Nov 09 '18 at 13:30

1 Answer

Should I worry about cleaning up, or will the generator somehow be garbage collected when it goes out of scope?

TL;DR Python's garbage collector and asyncio together ensure eventual cleanup of incompletely iterated async generators, provided the event loop keeps running long enough to execute it.

"Cleanup" here refers to running the code in a finally block around the yield, or the __aexit__ part of a context manager used in an async with statement around the yield. For example, the print in this simple generator is invoked by the same mechanism that an aiohttp.ClientSession relies on to close its resources:

import asyncio

async def my_gen():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        await asyncio.sleep(0.1)  # make it interesting by awaiting
        print('cleaned up')

If you run a coroutine that iterates through the whole generator, the cleanup will be executed immediately:

>>> async def test():
...     gen = my_gen()
...     async for _ in gen:
...         pass
...     print('test done')
... 
>>> asyncio.get_event_loop().run_until_complete(test())
cleaned up
test done

Note that the cleanup runs before the loop finishes, even though the generator is still in scope and could not have been garbage collected yet. This is because exhausting an async generator runs it to completion: its finally block executes as part of the last step of the async for loop, before the loop ends.

The question is what happens when the loop is not exhausted:

>>> async def test():
...     gen = my_gen()
...     async for _ in gen:
...         break  # exit at once
...     print('test done')
... 
>>> asyncio.get_event_loop().run_until_complete(test())
test done

Here gen went out of scope, but the cleanup simply didn't occur. With an ordinary generator, the reference counter would trigger the cleanup immediately (though still after the exit from test, because that's when the suspended generator is no longer referenced). This is possible because gen does not participate in a reference cycle:

>>> def my_gen():
...     try:
...         yield 1
...         yield 2
...         yield 3
...     finally:
...         print('cleaned up')
... 
>>> def test():
...     gen = my_gen()
...     for _ in gen:
...         break
...     print('test done')
... 
>>> test()
test done
cleaned up

With my_gen being an asynchronous generator, its cleanup is asynchronous as well. This means the garbage collector can't simply execute it; it has to be run by an event loop. To make this possible, asyncio registers an asyncgen finalizer hook that schedules the generator's aclose() on the loop. Here the scheduled cleanup never gets a chance to run to completion, because run_until_complete stops the loop as soon as the coroutine it was given finishes.
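The hook registration can be observed directly. The following is a minimal sketch, assuming CPython's default asyncio event loop; `inspect_hooks` is a name made up for this illustration. It reads `sys.get_asyncgen_hooks()` from inside a running loop, where asyncio has installed its own `firstiter` and `finalizer` callbacks:

```python
import asyncio
import sys


async def inspect_hooks():
    # While the loop is running, asyncio has installed its own hooks
    # for tracking and finalizing async generators.
    hooks = sys.get_asyncgen_hooks()
    return hooks.firstiter, hooks.finalizer


firstiter, finalizer = asyncio.run(inspect_hooks())
print("firstiter:", firstiter)
print("finalizer:", finalizer)
```

The finalizer is what the interpreter calls when an unfinished async generator is about to be garbage collected; the firstiter hook lets the loop keep track of generators once they are first iterated.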

If we tried to spin the same event loop some more, we'd see the cleanup executed:

>>> asyncio.get_event_loop().run_until_complete(asyncio.sleep(0))
cleaned up
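A related way to flush pending async-generator cleanup without relying on garbage collection is `loop.shutdown_asyncgens()`, which closes every async generator the loop is still tracking. Here is a minimal sketch, assuming a hypothetical `holder` list that keeps the generator referenced after `test()` returns, and an `events` list (also an assumption of this example) that records the order of events instead of printing:

```python
import asyncio

events = []  # records what happened, in order
holder = []  # hypothetical: keeps the generator alive after test() returns


async def my_gen():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        await asyncio.sleep(0.1)
        events.append("cleaned up")


async def test():
    gen = my_gen()
    async for _ in gen:
        break  # leave the generator suspended at its first yield
    holder.append(gen)
    events.append("test done")


loop = asyncio.new_event_loop()
try:
    loop.run_until_complete(test())
    before_shutdown = list(events)  # cleanup has not run yet
    # Close all async generators the loop still knows about.
    loop.run_until_complete(loop.shutdown_asyncgens())
finally:
    loop.close()

print(before_shutdown)  # ['test done']
print(events)           # ['test done', 'cleaned up']
```

Because the generator is still referenced, no finalizer fires on its own; the explicit `shutdown_asyncgens()` call is what runs the finally block.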

In a normal asyncio application this does not lead to problems because the event loop typically runs as long as the application. If there is no event loop to clean up the async generators, it likely means the process is exiting anyway.
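If the cleanup must happen at a known point rather than whenever the garbage collector and the loop get around to it, the generator can be closed explicitly with `aclose()`. This is a sketch under the same assumptions as the examples above, with an `events` list introduced here only to record the order of events:

```python
import asyncio

events = []  # records what happened, in order


async def my_gen():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        await asyncio.sleep(0)
        events.append("cleaned up")


async def test():
    gen = my_gen()
    try:
        async for _ in gen:
            break  # exit at once
    finally:
        # Run the generator's finally block right here.
        await gen.aclose()
    events.append("test done")


asyncio.run(test())
print(events)  # ['cleaned up', 'test done']
```

On Python 3.10 and later, `contextlib.aclosing(gen)` used in an `async with` statement expresses the same try/finally pattern more compactly.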

user4815162342
  • If you use [`asyncio.run()`](https://docs.python.org/3.9/library/asyncio-task.html#asyncio.run) instead of `loop.run_until_complete()`, it will automatically wait for async generators to clean up before returning. – Maxpm Aug 02 '21 at 13:53
  • There's also [PEP 533](https://www.python.org/dev/peps/pep-0533/), which is supposed to make async generators get cleaned up at a deterministic time, rather than whenever they happen to be garbage-collected. Unfortunately, that PEP is currently "deferred" for reasons unknown to me. – Maxpm Aug 02 '21 at 13:57