Until now, when I used the len function with various container types (say, the list type for now), I assumed that each container type has a field which stores the length of that particular object. Coming from Java, this made a lot of sense. But now that I think about it, I'm not sure this is true, and that confuses me.

When I call len on an object which implements __len__, does it calculate the length by iterating over the object's elements, or does it somehow return the length immediately?

The question actually came to me while using the built-in dict type. I added a lot of elements to a dictionary and eventually needed the number of elements in it. Because I wasn't sure what the time complexity of the len function is, I decided to count the elements as I inserted them, but I'm not sure this is the right solution to my problem.

Here is some example code for my question:

d = {}
count = 0
for i in range(10 ** 6):
    d[i] = True
    count += 1

VS

d = {i: True for i in range(10 ** 6)}
count = len(d)

The second solution looks nicer (and shorter) to me. I know that theoretically the time complexity is the same whether or not len is instant, but I'm afraid the second solution iterates to 10 ** 6 twice (first for the dictionary comprehension, and second for the length calculation).

Enlighten me please.

rboy
  • AFAIK all built-ins (and standard library types) provide `O(1)` `__len__` implementations, even `deque`, which is a doubly-linked list. However, for custom types `__len__` can have a different complexity. Anyway, you *should* always use it, because *using `len` conveys the meaning of what you want to do*, while other solutions are much less readable and, with any reasonable type, are slower or at best equal in speed. – Bakuriu Oct 17 '13 at 11:46
  • I actually made a search but I probably didn't search for the right terms. Thank you for the reference. – rboy Oct 17 '13 at 11:52

1 Answer

You are very definitely over-thinking this. Python is not really the language that you should be using if you're worried about optimising at this level.

That said, on the whole Python's containers do know their own lengths, without having to iterate. The built-in types are implemented in C (in the CPython implementation), and I'd have to dig into the actual code to find out exactly where the size is stored, but for the built-in containers len is a constant-time call, so your second version does not iterate a second time.
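If you want to check this yourself rather than take it on trust, here is a rough sketch using the standard `timeit` module. It times `len` on a small and a large dict; the two timings should come out roughly equal, though exact numbers depend on your machine. The `CountingLen` class is purely hypothetical and only illustrates that a custom type is free to implement a slow `__len__`.

```python
import timeit


class CountingLen:
    """Hypothetical container whose __len__ walks every element (O(n))."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        # Deliberately slow: counts by iterating instead of using a stored size.
        return sum(1 for _ in self._items)


small = {i: True for i in range(10 ** 3)}
large = {i: True for i in range(10 ** 6)}

# len() on a dict does not iterate, so both timings should be about the same
# even though one dict is a thousand times larger.
t_small = timeit.timeit(lambda: len(small), number=100_000)
t_large = timeit.timeit(lambda: len(large), number=100_000)
print(f"len(small dict): {t_small:.4f}s, len(large dict): {t_large:.4f}s")

slow = CountingLen(range(10 ** 4))
# Still returns the right answer, but the cost grows with the number of items.
print(len(slow))
```

For the built-in types there is nothing to gain from keeping your own counter; just call `len` when you need the size.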

Daniel Roseman
  • If I happen to call `len` a lot in the same code on the same object, this "optimization thinking" is justified. Usually I would just store the length in a variable. But say I have a function which populates an empty dictionary and returns it at the end. If `len` were O(n), I would also return the length from that function (if the length is needed). If it's O(1), as you say, I wouldn't, and would just call `len` on the dictionary when necessary. – rboy Oct 17 '13 at 12:00