22

This may be a stupid question but I will ask it anyway. I have a generator object:

>>> def gen():
...     for i in range(10):
...         yield i
...         
>>> obj=gen()

I can measure its size:

>>> obj.__sizeof__()
24

It is said that generators get consumed:

>>> for i in obj:
...     print i
...     
0
1
2
3
4
5
6
7
8
9
>>> obj.__sizeof__()
24

...but obj.__sizeof__() remains the same.

With strings it works as I expected:

>>> 'longstring'.__sizeof__()
34
>>> 'str'.__sizeof__()
27

I would be thankful if someone could enlighten me.

root

4 Answers

43

__sizeof__() does not do what you think it does. The method returns the internal size in bytes for the given object, not the number of items a generator is going to return.

Python cannot know the size of a generator beforehand. Take, for example, the following endless generator (for illustration only; there are better ways to create a counter):

def count():
    count = 0
    while True:
        yield count
        count += 1

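That generator is endless; there is no size assignable to it. Yet the generator object itself takes memory. The size has to be taken from the object that calling count() returns, not from the function count itself, and the exact figure depends on the Python version and platform:

>>> g = count()
>>> g.__sizeof__() > 0
True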

You don't normally call __sizeof__() directly; you leave that to the sys.getsizeof() function, which also adds garbage collector overhead.
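As a rough illustration of the point above, the following sketch (written for Python 3; the helper name `sizes_before_and_after` is made up for this example) compares `__sizeof__()` with `sys.getsizeof()` and measures the generator again after it has been consumed. The byte counts depend on the interpreter version and platform, so none are shown here.

```python
import sys


def gen():
    for i in range(10):
        yield i


def sizes_before_and_after(generator):
    """Return (raw, with_gc, raw_after) sizes in bytes for a generator.

    Hypothetical helper for illustration only; it exhausts the generator.
    """
    raw = generator.__sizeof__()        # size of the object itself
    with_gc = sys.getsizeof(generator)  # same, plus garbage collector overhead
    for _ in generator:                 # consume every item
        pass
    raw_after = generator.__sizeof__()  # measured again once exhausted
    return raw, with_gc, raw_after


raw, with_gc, raw_after = sizes_before_and_after(gen())
print(raw, with_gc, raw_after)
```

On CPython the first and last numbers normally match, which is what the question observed: consuming a generator does not shrink the object, because the size describes the object's memory footprint and not the items still to come.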

If you know a generator is going to be finite and you have to know how many items it returns, use:

sum(1 for item in generator)

but note that this exhausts the generator.
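A minimal sketch of that counting idiom and of the exhaustion it causes, written for Python 3. The helper name `count_items` and the sample generator are assumptions made for the example:

```python
def count_items(iterable):
    """Count the items an iterable yields; this consumes iterators."""
    return sum(1 for _ in iterable)


squares = (n * n for n in range(10))
total = count_items(squares)    # walks the whole generator
leftover = next(squares, None)  # None: nothing is left to yield
print(total, leftover)          # prints: 10 None
```

If the items themselves are still needed afterwards, they have to be stored while counting (for example in a list), at the cost of the memory the generator was saving.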

Ned Batchelder
Martijn Pieters
8

As said in other answers, __sizeof__ returns a different thing.

Only some iterators have a method that reports how many elements have not been returned yet. For example, listiterator has a __length_hint__ method for this:

>>> L = [1,2,3,4,5]
>>> it = iter(L)
>>> it
<listiterator object at 0x00E65350>
>>> it.__length_hint__()
5
>>> help(it.__length_hint__)
Help on built-in function __length_hint__:

__length_hint__(...)
    Private method returning an estimate of len(list(it)).

>>> it.next()
1
>>> it.__length_hint__()
4
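In Python 3.4 and later the same information is reachable through the public `operator.length_hint()` function instead of the private method. The sketch below assumes Python 3 and shows that a list iterator reports its remaining items, while a generator offers no hint and the default of 0 comes back:

```python
from operator import length_hint


def gen():
    for i in range(10):
        yield i


it = iter([1, 2, 3, 4, 5])
before = length_hint(it)      # 5 items still to come
next(it)                      # Python 3 spelling of it.next()
after = length_hint(it)       # 4 items still to come

unknown = length_hint(gen())  # generators provide no hint, so the default 0 is returned
print(before, after, unknown)  # prints: 5 4 0
```

A result of 0 for the generator therefore means "unknown", not "empty"; the hint is only an estimate and is not a substitute for len().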
ovgolovin
  • Thanks for the __length_hint__. Also, len(list(it)) consumes an iterator but __length_hint__ does not. – root Sep 18 '12 at 13:49
  • @root Yes, `list` consumes it, converts it to a list (which takes some memory), and only then calculates the length of the created list, whereas `__length_hint__` is a method implemented specially for the list iterator object. – ovgolovin Sep 18 '12 at 13:52
1

__sizeof__ returns the memory size of an object in bytes, not the length of a generator, which is impossible to determine up front because a generator can keep yielding values indefinitely.

Hans Then
0

If you are certain that the generator you've created is "finite" (it eventually stops yielding elements) and you don't mind waiting a while, you can use the following to get what you want:

len(list(gen()))

As the other posters said, __sizeof__() is a measure of how much memory something takes up (a much lower-level concept that you will probably rarely need), not its length (which is not a feature of generators, since there's no guarantee they ever finish).
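When it is not certain that a generator is finite, one cautious option is to cap the count. The following is only a sketch under that assumption, written for Python 3; the helper name `count_at_most` and the limit of 1000 are invented for the example, and it relies on the standard `itertools.islice` and `itertools.count`:

```python
from itertools import count, islice


def count_at_most(iterable, limit):
    """Count items, but stop after `limit` so an endless generator cannot hang."""
    return sum(1 for _ in islice(iterable, limit))


finite = count_at_most((c for c in "abc"), 1000)  # 3: the generator ran out first
capped = count_at_most(count(), 1000)             # 1000: the cap was reached
print(finite, capped)                             # prints: 3 1000
```

A result equal to the limit only says "at least this many"; the generator may well have more to give. Unlike len(list(...)), this never holds more than one item in memory at a time.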

Chris Pfohl