Python has many built-in functions, and len() is one of them.

Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set).

If collections and sequences are objects, they could hold a length attribute that can be updated every time something changes. Accessing this attribute would be a fast way to retrieve the collection's length.

Another approach is to iterate through the collection and count the number of items on the fly.

How does len() calculate that length? Through iteration or attribute access? One, neither, both, or some other approach?

Martijn Pieters
NPN328

1 Answer

Python built-in collections cache the length in an attribute. Using len() will not iterate over all elements to count them, no.

Keeping the length as an attribute is cheap and easy to maintain. Given that the Python built-in collection types are used so widely, it'd be unwise for them not to do this.
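To make the contrast with the question's two approaches concrete, here is a minimal sketch. `TrackedList` and `count_by_iteration` are hypothetical names made up for illustration, not part of Python; the class only mimics in pure Python what the built-in types do in C, and it records whether `len()` ever triggers iteration.

```python
class TrackedList:
    """Hypothetical container that keeps its size in an attribute."""

    def __init__(self, items=()):
        self._items = []
        self._size = 0        # updated on every mutation
        self.iter_calls = 0   # counts how often __iter__ runs
        for item in items:
            self.append(item)

    def append(self, item):
        self._items.append(item)
        self._size += 1

    def __len__(self):
        # O(1): report the stored counter, no traversal
        return self._size

    def __iter__(self):
        self.iter_calls += 1
        return iter(self._items)


def count_by_iteration(iterable):
    """The alternative from the question: walk every item and count."""
    return sum(1 for _ in iterable)


tracked = TrackedList("abc")

length_via_len = len(tracked)                  # uses __len__ only
calls_after_len = tracked.iter_calls           # still 0: nothing was iterated

length_via_iter = count_by_iteration(tracked)  # walks all items
calls_after_iter = tracked.iter_calls          # now 1
```

Both approaches agree on the result here, but only the second one touches every element, so its cost grows with the size of the collection.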

Python built-in types with a length are typically built on top of the PyVarObject struct (declared with the PyObject_VAR_HEAD macro), which includes an ob_size field. The Py_SIZE macro can then be used to implement the object.__len__ method (e.g. the PySequenceMethods.sq_length slot in the C-API). See the list_length function in listobject.c for an example.
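From Python code, that slot is visible as the `__len__` method on the type. The sketch below (where `length_via_slot` is a hypothetical helper, not a standard function) calls the type-level `__len__` directly and compares it with `len()` for several built-ins. It also shows that `len()` raises `TypeError` for a generator, which has no `__len__`, rather than falling back to counting by iteration.

```python
def length_via_slot(obj):
    """Call the type-level __len__ directly, roughly what len() does."""
    return type(obj).__len__(obj)


samples = [[1, 2, 3], (1, 2), "hello", range(10), {"a": 1}, {1, 2, 3}]

# Each pair holds (len(obj), type(obj).__len__(obj)); the two always agree.
results = [(len(obj), length_via_slot(obj)) for obj in samples]

# A generator defines no __len__, so len() refuses instead of iterating.
try:
    len(x for x in range(3))
    generator_has_len = True
except TypeError:
    generator_has_len = False
```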

Martijn Pieters
  • Sometimes even, the length of a collection is not the same as the length of its iterator. E.g. a pandas DataFrame returns as length the number of rows, while its iterator iterates over the columns! – kadee Mar 07 '19 at 15:27
  • @kadee: so? You can also iterate over the rows in Pandas, and several other iterations to boot. What has that got to do with specifying a length? – Martijn Pieters Mar 07 '19 at 15:31
  • @kadee (and if you need to know the number of columns, use `len(dataframe.columns)`). – Martijn Pieters Mar 07 '19 at 15:31
  • It means that 'iterating through the collection and counting the number of items' as proposed by @Jamm would yield a different result than given by len(). In other words `len(dataframe) != len(list(dataframe.__iter__()))` I just found that remarkable. – kadee Mar 07 '19 at 16:16