I need some clarification on what is going on with the following behavior when using a generator. I'm afraid I am missing something fundamental, so any advice is welcome.
(edit) Specifically, my question deals with the del of the iterable that an iterator is created on.
My ultimate use case is iterating over a fairly massive corpus for text processing. It isn't too large to hold in memory, but it is big enough that training a subsequent model is impossible while it stays in memory.
So, in my investigation, I attempted the following, and I'm confused about how it works.
>>> iterable = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
>>> iterable.__sizeof__()
160
>>> iterator = (x+1 for x in iterable)
>>> iterator
<generator object <genexpr> at 0x1019f8570>
>>> iterator.__sizeof__()
64
>>> del iterable
>>> for i in iterator:
... print(i)
...
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Once I delete the iterable, what is the iterator referencing? How can it have a smaller memory footprint and still execute successfully? (I was under the impression that the generator simply points to the existing data, but if that data has been deleted, I can't explain why the loop still works.)
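To test my assumption, here is a minimal sketch of how I could check whether the list object itself survives the del. Corpus is a hypothetical list subclass I made up only because plain lists cannot be weakly referenced, and I'm assuming CPython's reference counting for the final check, so treat this as an experiment rather than a definitive explanation.

```python
import gc
import weakref


class Corpus(list):
    """Hypothetical list subclass: plain lists do not support weak references."""


iterable = Corpus(range(1, 16))
alive = weakref.ref(iterable)          # observes the object without keeping it alive

# The leftmost iterable of a generator expression is evaluated when the
# generator is created, so the generator now holds its own reference to the list.
iterator = (x + 1 for x in iterable)

del iterable                           # removes the name, not necessarily the object
gc.collect()
still_alive_after_del = alive() is not None

results = list(iterator)               # consume the generator
del iterator                           # drop the last thing that could reference the list
gc.collect()
freed_after_iteration = alive() is None  # assumption: CPython reclaims it at this point

print(still_alive_after_del)
print(results)
print(freed_after_iteration)
```

If the first check prints True, that would suggest del only unbinds the name while the generator keeps the list alive, which would also mean the generator's reported size does not include the list it iterates over.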
What am I missing? (Something, clearly. Sorry, I'm self-taught.) Thanks in advance!