11

A similar question has already been asked (Cost of len() function). However, that question looks at the cost of len itself. Suppose I have code that calls len(List) many times: every call is O(1), reading a variable is also O(1), and assigning it is also O(1).

As a side note, I find that n_files = len(Files) is somewhat more readable than repeated len(Files) in my code, so that is already an incentive for me to do this. You could argue against this that Files might be modified somewhere in the code, so that n_files is no longer correct, but that is not the case here.

My question is:
Is there a number of calls to len(Files) after which accessing n_files will be faster?

oz123
  • 1
    Given that Python will have to go and look up `len` each time, the number is probably 1. But why not do some tests and find out? As both approaches are `O(1)` this is all about the fixed costs. – jonrsharpe Jul 22 '15 at 10:06
  • I thought that `len` would be cheaper compared to the creation of a new `int` instance ... – oz123 Jul 22 '15 at 10:08
  • @jonrsharpe Then what about doing `len_func = len` and then calling `len_func(Files)`? – DeepSpace Jul 22 '15 at 10:09
  • Note that small integers are interned in CPython, so if your list is shorter than `255` (IIRC) you'll get the same instance. – jonrsharpe Jul 22 '15 at 10:09
  • @DeepSpace that's a good question - locally aliasing functions is one way to boost performance, and that may make the comparison much tighter. There's still the function *call*, though. – jonrsharpe Jul 22 '15 at 10:09
  • @DeepSpace Doesn't that require that you'll have to lookup `len_func` instead? – skyking Jul 22 '15 at 10:41
  • @skyking yes, but the local lookup is faster than the global one. See e.g. http://stackoverflow.com/q/20388114/3001761 – jonrsharpe Jul 22 '15 at 10:42
  • What are you actually doing that you are worried about an `O(1)` operation? – Padraic Cunningham Jul 22 '15 at 10:55
  • 1
    @PadraicCunningham, because `O(1)` can be 5 ms or 500 ms, and I am trying to understand Python better, not just to be a `coder`... Asking this question here will also (hopefully) help others too... – oz123 Jul 22 '15 at 11:18
  • https://wiki.python.org/moin/PythonSpeed/PerformanceTips – Padraic Cunningham Jul 22 '15 at 11:51
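The three options raised in these comments (calling the builtin `len` each time, calling a local alias, and reusing a stored length) can be compared with a small timing sketch. The function names, the `FILES` list and the `REPEATS` count are made up for illustration, and the absolute numbers will depend on the interpreter version and the machine.

```python
from timeit import timeit

FILES = [None] * 10  # stand-in for the question's Files list (assumption)
REPEATS = 100        # hypothetical number of length reads per call


def call_global(files=FILES):
    # Looks up the builtin name `len` and calls it on every iteration.
    total = 0
    for _ in range(REPEATS):
        total += len(files)
    return total


def call_alias(files=FILES):
    # Binds `len` to a local name once, then calls the local alias.
    len_ = len
    total = 0
    for _ in range(REPEATS):
        total += len_(files)
    return total


def use_stored(files=FILES):
    # Calls len once and reuses the stored integer afterwards.
    n_files = len(files)
    total = 0
    for _ in range(REPEATS):
        total += n_files
    return total


if __name__ == "__main__":
    for func in (call_global, call_alias, use_stored):
        seconds = timeit(func, number=10000)
        print("{:12s} {:.4f} s".format(func.__name__, seconds))
```

All three functions compute the same total, so any difference in the printed times comes from how the length is obtained, not from the work done with it.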

4 Answers

14

A few results (time, in seconds, for one million calls) with a ten-element list, using Python 2.7.10 on Windows 7. Store is whether we store the length or keep calling len, and Alias is whether or not we create a local alias for len:

Store Alias n=      1      10     100
Yes   Yes       0.862   1.379   6.669
Yes   No        0.792   1.337   6.543
No    Yes       0.914   1.924  11.616
No    No        0.879   1.987  12.617

and a thousand-element list:

Store Alias n=      1      10     100
Yes   Yes       0.877   1.369   6.661
Yes   No        0.785   1.299   6.808
No    Yes       0.926   1.886  11.720
No    No        0.891   1.948  12.843

Conclusions:

  • Storing the result is more efficient than calling len repeatedly, even for n == 1;
  • Creating a local alias for len can make a small improvement for larger n where we aren't storing the result, but not as much as just storing the result would; and
  • The influence of the length of the list is negligible, suggesting that whether or not the integers are interned isn't making any difference.

Test script:

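from __future__ import print_function  # lets the script run on Python 2 and 3

from itertools import product
from timeit import timeit


def test(n, l, store, alias):
    # Set-up: optionally alias len locally, and take the length once.
    if alias:
        len_ = len
        len_l = len_(l)
    else:
        len_l = len(l)
    # Read the length n times, either from the stored value or by calling len.
    for _ in range(n):
        if store:
            _ = len_l
        elif alias:
            _ = len_(l)
        else:
            _ = len(l)


if __name__ == '__main__':
    # timeit imports test and the current list l from this module.
    setup = 'from __main__ import test, l'
    for n, l, store, alias in product(
        (1, 10, 100),
        ([None]*10, [None]*1000),  # the two list sizes reported above
        (True, False),
        (True, False),
    ):
        test_case = 'test({!r}, l, {!r}, {!r})'.format(n, store, alias)
        # timeit runs each statement one million times by default.
        print(test_case, len(l), timeit(test_case, setup=setup))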
jonrsharpe
2

Function calls in Python are costly, so if you are 100% sure that the size of Files will not change while you are reading its length from the variable, you can use the variable, especially if that is also more readable for you.
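The caveat about the list changing can be illustrated with a short sketch. The file names are hypothetical; the point is only that the stored integer is a snapshot and does not follow later changes to the list.

```python
files = ["a.txt", "b.txt", "c.txt"]  # hypothetical file list
n_files = len(files)                 # snapshot of the length at this moment

files.append("d.txt")                # the list changes afterwards

# n_files is a plain int and is not updated when the list grows.
stale = n_files != len(files)
print(n_files, len(files), stale)

# Refresh the cached value after any modification.
n_files = len(files)
```

If the list is only built once and then read, as in the question, this situation does not arise and the stored value stays valid.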

An example performance test comparing len(list) with reading the stored variable gives the following result:

In [36]: l = list(range(100000))

In [37]: n_l = len(l)

In [40]: %timeit newn = len(l)
10000000 loops, best of 3: 92.8 ns per loop

In [41]: %timeit new_n = n_l
10000000 loops, best of 3: 33.1 ns per loop

In this test, accessing the variable is faster than calling len().

Anand S Kumar
1

Using l = len(li) is faster:

python -m timeit -s "li = [1, 2, 3]" "len(li)"
1000000 loops, best of 3: 0.239 usec per loop

python -m timeit -s "li = [1, 2, 3]; l = len(li)" "l"
10000000 loops, best of 3: 0.0949 usec per loop
f43d65
1

Using len(Files) instead of n_files is likely to be slower. Yes, you have to look up n_files, but in the former case you have to look up both len and Files and, on top of that, call a function that "calculates" the length of Files.
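One way to see this reasoning is to compare the bytecode of the two expressions with the standard `dis` module. This is a sketch under the assumption that both names are local variables (here, function parameters); the helper `opnames` and the two function names are made up for illustration, and the exact instruction names vary between Python versions.

```python
import dis


def with_call(files):
    return len(files)      # look up len, load files, then call


def with_stored(n_files):
    return n_files         # load a single local variable


def opnames(func):
    # Collect the bytecode instruction names for a function.
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    dis.dis(with_call)
    dis.dis(with_stored)
```

The listing for `with_call` contains a global lookup for `len` and a call instruction, neither of which appears in the listing for `with_stored`.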

skyking
  • What do you feel this adds to the previous answers? – jonrsharpe Jul 22 '15 at 10:42
  • @jonrsharpe It's a theoretical explanation of why it should be faster. Two of the answers are more or less just experimental tests to see which is fastest, and Anand's only points out that function calls are costly (which they may be). – skyking Jul 22 '15 at 10:46
  • 1
    Removed part of the answer that was false. Currently, Python does not calculate the length of lists; it is stored as a variable within the object. https://github.com/python/cpython/blob/2.5/Objects/listobject.c#L379 – DonCarleone Dec 23 '20 at 04:54
  • @DonCarleone In this case "calculation" can be considered as just O(1) operation – fdermishin Dec 23 '20 at 11:42
  • @fdermishin cal·cu·late /ˈkalkyəˌlāt/ verb 1. determine (the amount or number of something) _mathematically_. – DonCarleone Dec 23 '20 at 15:58
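The "stored, not calculated" point in these comments can be pictured with a pure-Python analogy. This is not CPython's implementation; `CountedList` is a hypothetical toy class that keeps a size counter up to date on every append, so that `len()` only has to return it.

```python
class CountedList:
    """Toy container that tracks its own size (illustrative analogy only)."""

    def __init__(self):
        self._items = []
        self._size = 0  # plays the role of the size field stored on the object

    def append(self, item):
        self._items.append(item)
        self._size += 1  # the size is updated when the container changes

    def __len__(self):
        # No counting loop here: just report the stored value.
        return self._size


counted = CountedList()
for name in ("a.txt", "b.txt", "c.txt"):
    counted.append(name)
print(len(counted))
```

Under this model the cost of `len()` does not depend on how many items are stored, which matches the O(1) behaviour discussed in the question.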