
It is stated in the Python documentation that one of the advantages of namedtuple is that it is as memory-efficient as tuples.

To validate this, I ran a test in IPython with ipython_memory_usage. The test showed that:

  • 10000000 instances of namedtuple used about 850 MiB of RAM
  • 10000000 tuple instances used around 73 MiB of RAM
  • 10000000 dict instances used around 570 MiB of RAM
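
For reference, a test along these lines reproduces the setup (the names and N are my reconstruction, not the original code from the screenshots):

```python
# Rough reconstruction of the test (names are assumptions, not the
# original code). In the question, each list was built in its own
# IPython cell with N = 10_000_000 while ipython_memory_usage reported
# the memory used per cell; N is reduced here so it runs quickly.
from collections import namedtuple

Point = namedtuple('Point', 'x y z')
N = 1_000_000

tuples = [(1, 2, 3) for _ in range(N)]
ntuples = [Point(1, 2, 3) for _ in range(N)]
dicts = [{'x': 1, 'y': 2, 'z': 3} for _ in range(N)]
```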

So namedtuple used much more memory than tuple, and even more than dict!

What do you think? Where did I go wrong?

Ammar Alyousfi
    I don't have a clear answer to your question, but it's possible that the peephole optimizer noticed that your tuple is defined as a literal with immutable members and gave you back a list of references to the same tuple. – mgilson Dec 06 '16 at 19:18
  • @Chinny84 -- Actually, I'm _really_ surprised that the dictionary takes less memory than a named-tuple. I know that if you're working in python3.6, dictionaries have been upgraded with a new implementation that should be more memory efficient, but I still don't think that should beat a tuple... – mgilson Dec 06 '16 at 19:20
  • @mgilson That's probably because the class returned by `namedtuple()` has some Python level attributes, on the other hand `dict` is still pure C. – Ashwini Chaudhary Dec 06 '16 at 19:34
    Like mgilson mentioned, try to create the tuples dynamically. CPython can cache literals of immutable objects, unfortunately namedtuple doesn't have a literal and hence it can't be cached. – Ashwini Chaudhary Dec 06 '16 at 19:36
  • @AshwiniChaudhary -- What instance level attributes does it have though? You can see the code used to generate a named tuple by passing `verbose=True` to the constructor... All of the attributes are defined at the class level -- And there's only one class no matter how many instances of it you make... – mgilson Dec 06 '16 at 20:52
    @mgilson: A quick check shows your hypothesis is correct. The construction of `(1, 2, 3)` gets constant-folded, and all `append` calls in the loop append the same tuple. – user2357112 Dec 08 '16 at 17:56
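
The constant-folding effect the comments describe is easy to check directly; this snippet (mine, not from the thread) contrasts a reused literal with dynamically built tuples:

```python
# CPython stores the immutable literal (1, 2, 3) as a single constant in
# the compiled code, so every loop iteration reuses the same object.
literals = [(1, 2, 3) for _ in range(5)]
print(literals[0] is literals[4])   # True: five references, one tuple

# Tuples built from variables cannot be folded; each one is a new object.
dynamic = [(i, i + 1, i + 2) for i in range(5)]
print(dynamic[0] is dynamic[1])     # False: genuinely distinct tuples
```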

2 Answers


A simpler metric is to check the size of equivalent tuple and namedtuple objects. Given two roughly analogous objects:

from collections import namedtuple
import sys

point = namedtuple('point', 'x y z')
point1 = point(1, 2, 3)

point2 = (1, 2, 3)

Get the size of them in memory:

>>> sys.getsizeof(point1)
72

>>> sys.getsizeof(point2)
72

They look the same to me...


Taking this a step further to replicate your results, notice that if you create a list of identical tuples the way you're doing it, each tuple is the exact same object:

>>> test_list = [(1,2,3) for _ in range(10000000)]
>>> test_list[0] is test_list[-1]
True

So in your list of tuples, each index holds a reference to the same object. There are not 10000000 tuples; there are 10000000 references to one tuple.

On the other hand, your list of namedtuple objects actually does create 10000000 unique objects.
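
One way to see this without process-level tools is the standard-library tracemalloc module, which counts actual allocations. The helper below is my own sketch, with a smaller N than the question's for speed:

```python
import tracemalloc
from collections import namedtuple

Point = namedtuple('Point', 'x y z')
N = 100_000  # smaller than the question's 10_000_000, for speed

def traced_size(make):
    """Bytes allocated while building a list of N objects."""
    tracemalloc.start()
    data = [make(i) for i in range(N)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del data
    return current

tuple_bytes = traced_size(lambda i: (i, i + 1, i + 2))
named_bytes = traced_size(lambda i: Point(x=i, y=i + 1, z=i + 2))
print(f"tuples:      ~{tuple_bytes // N} B/element")
print(f"namedtuples: ~{named_bytes // N} B/element")
```

Because every element here is a distinct object, the two per-element figures come out roughly the same.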

A better apples-to-apples comparison would be to view the memory usage for

>>> test_list = [(i, i+1, i+2) for i in range(10000000)]

and:

>>> test_list_n = [point(x=i, y=i+1, z=i+2) for i in range(10000000)]

They have the same size:

>>> sys.getsizeof(test_list)
81528056

>>> sys.getsizeof(test_list_n)
81528056
Billy
    Which interestingly is the same size as a dictionary: >>> test_list_d = [{"x":i, "y":i+1, "z":i+2} for i in range(10000000)] >>> sys.getsizeof(test_list_d) 81528056 – tekumara Dec 30 '17 at 03:29
    That's because you're always just counting the size of a generator object and not the resulting data structure – powo Oct 05 '18 at 06:26
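
As the last comment hints, sys.getsizeof is shallow: it counts the list's reference array, not the elements. A rough deep measurement (my sketch; it still ignores the int objects the elements reference) makes the dict overhead visible:

```python
import sys
from collections import namedtuple

Point = namedtuple('Point', 'x y z')
N = 1_000

tuples = [(i, i + 1, i + 2) for i in range(N)]
ntuples = [Point(x=i, y=i + 1, z=i + 2) for i in range(N)]
dicts = [{'x': i, 'y': i + 1, 'z': i + 2} for i in range(N)]

def deep_size(lst):
    # shallow list size + shallow size of every element
    # (still ignores the int objects the elements reference)
    return sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

print(deep_size(tuples))   # same as deep_size(ntuples)
print(deep_size(dicts))    # noticeably larger per element
```

Named tuples define `__slots__ = ()`, so each instance is laid out exactly like a plain tuple, and the two deep sizes match.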

Doing some investigation myself (with Python 3.6.6), I came to the following conclusions:

  1. In all three cases (list of tuples, list of named tuples, list of dicts), sys.getsizeof returns the size of the list, which stores only references anyway. So you get the size 81528056 in all three cases.

  2. Sizes of the elementary types are:

     >>> sys.getsizeof((1, 2, 3))
     72

     >>> sys.getsizeof(point(x=1, y=2, z=3))
     72

     >>> sys.getsizeof(dict(x=1, y=2, z=3))
     240

  3. Construction time is much worse for named tuples:
     list of tuples: 1.8s
     list of named tuples: 10s
     list of dicts: 4.6s

  4. Looking at the system load, I became suspicious about the results from getsizeof. After measuring the footprint of the Python 3 process, I get:

    test_list = [(i, i+1, i+2) for i in range(10000000)]
    increase by: 1 745 564K
    that is about 175B per element

    test_list_n = [point(x=i, y=i+1, z=i+2) for i in range(10000000)]
    increase by: 1 830 740K
    that is about 183B per element

    test_list_d = [dict(x=i, y=i+1, z=i+2) for i in range(10000000)]
    increase by: 2 717 492 K
    that is about 272B per element
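
The process-footprint numbers above can be reproduced without external tools using the standard-library resource module. This sketch is mine (Unix-only; note that ru_maxrss is KiB on Linux but bytes on macOS):

```python
import resource

def max_rss():
    # peak resident set size of this process
    # (KiB on Linux, bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = max_rss()
test_list = [(i, i + 1, i + 2) for i in range(1_000_000)]
after = max_rss()
print(f"increase: ~{after - before}")
```

Because maxrss is a high-water mark that never decreases, each of the three cases should be measured in a fresh process.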

Jan Brezina