1

Let's say that we have to use the length of a list in certain calculations in a loop. Which would be faster: calling `len(list_)` in each calculation, or storing the length once with `length = len(list_)` and then using `length`? For example:

for x in range(n):
    print(len(list_) + 1)

Versus

length = len(list_)
for x in range(n):
    print(length + 1)

Assume a generic situation (n can be any value).

Tagc
Noah May
  • 5
    Why don't you test it out using `time.time()` and report back to us? – blacksite Jan 08 '17 at 02:07
  • 5
    Even better, there's an entire [timeit](https://docs.python.org/3.6/library/timeit.html) module designed to help with this, which avoids a lot of the problems just using `time.time()` can introduce. – DSM Jan 08 '17 at 02:11
  • 3
    The only difference between those code blocks is that the former looks up a function and calls it with a second accessed variable, while the latter simply accesses a variable. Which do you think is faster? After you've made an educated guess, run some tests and find out. Then ignore the result and use whatever code looks cleaner. – TigerhawkT3 Jan 08 '17 at 02:17
  • 1
    The len function is pretty fast. It calls the `.__len__` method which just does an attribute lookup on built-in containers. Stashing the length in a local should be slightly faster, but don't clutter your code like that unless it's a huge loop. – PM 2Ring Jan 08 '17 at 02:23
  • It should be about the same with a small dataset; the difference in compute time is only noticeable with a large one. However, the latter is better. The first looks like it has to repeat the `len` call `n` times. – Akinjide Jan 08 '17 at 02:27
  • 1
    What you should actually be doing is saving `len(list_) + 1`, and giving that variable a name that represents its purpose in your code. – TigerhawkT3 Jan 08 '17 at 02:34

3 Answers

2

Here's a simple test using timeit as @DSM suggested:

# Calls len() on every iteration of the loop.
def direct_len(lst):
    total = 0
    for x in range(1000):
        total += len(lst) + 1

# Calls len() once and reuses the stored result.
def precalc(lst):
    length = len(lst)
    total = 0
    for x in range(1000):
        total += length + 1

if __name__ == '__main__':
    import timeit
    # Note: building list(range(100)) is part of each timed statement.
    print(timeit.timeit("direct_len(list(range(100)))", setup="from __main__ import direct_len", number=10000))
    print(timeit.timeit("precalc(list(range(100)))", setup="from __main__ import precalc", number=10000))

With the above, I get the following result with Python 3.5 on Windows 8:

1.3909554218576217
0.8262501212985289
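
The timed statements above also build a new list on every run, which adds the same overhead to both measurements. Here is a minimal sketch of one way to reduce that shared overhead, assuming the list is built once outside the timed call and the best of several repeats is reported. The helper `best_time` and the `n` parameter are names introduced here for illustration and are not part of the original test; actual timings will vary by machine and Python version.

```python
import timeit


def direct_len(lst, n=1000):
    """Call len() on every iteration."""
    total = 0
    for _ in range(n):
        total += len(lst) + 1
    return total


def precalc(lst, n=1000):
    """Call len() once and reuse the stored value."""
    length = len(lst)
    total = 0
    for _ in range(n):
        total += length + 1
    return total


def best_time(func, lst, number=1000, repeat=5):
    """Hypothetical helper: best total time over `repeat` runs of `number` calls."""
    timer = timeit.Timer(lambda: func(lst))
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    data = list(range(100))  # built once, outside the timed code
    assert direct_len(data) == precalc(data)  # both loops compute the same total
    print("direct_len:", best_time(direct_len, data))
    print("precalc:   ", best_time(precalc, data))
```

Taking the minimum of several repeats is the approach the `timeit` documentation suggests for reducing noise from other processes.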
niemmi
  • I knew it. The first looks like it has to repeat the `len` call `n` times. – Akinjide Jan 08 '17 at 02:28
  • 1
    Note that you're timing a lot of identical overhead stuff, which masks the performance difference in the code that's actually different. – TigerhawkT3 Jan 08 '17 at 02:31
  • Out of interest I wonder if it would make much difference in the results if you ran this program with one of the tests commented out and again with the other test commented out and compared the results that way. – Tagc Jan 08 '17 at 02:37
  • @TigerhawkT3 Yes of course, I just decided to use examples provided in the question. Comparing these numbers with only variable access/`len` call as in your example shows how fast the difference becomes meaningless when some functionality is added. – niemmi Jan 08 '17 at 02:37
  • @Tagc I fail to see why that would change anything. Anyway, I tried it and the results seem to be the same if I comment one of the tests out. Note, though, that there's slight variance between runs in any case. – niemmi Jan 08 '17 at 02:41
  • @niemmi In case the interpreter makes different optimisations if both implementations are present (vs. just one or the other in a real-world scenario), or if executing the first test causes caching that results in optimistic timing for the second test. I was just curious and like you I didn't find any significant deviation. – Tagc Jan 08 '17 at 02:46
0

Accessing a single stored variable is much faster than accessing a function and passing it an accessed variable.

>>> import timeit
>>> timeit.timeit('x', setup='x=len([1,2])')
0.024496269777304097
>>> timeit.timeit('len(x)', setup='x=[1,2]')
0.10009170159894687

However, as I said in my comment above, it doesn't matter. It might matter if the function you're calling is extremely expensive, but that's not the case this time. Use whatever makes your code look cleaner.
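
One way to see where the extra work comes from is to look at what each expression has to look up. This is a small sketch, assuming CPython; `names_used` is a helper name introduced here for illustration. It compiles each expression and lists the names the bytecode refers to, then prints the disassembly, whose exact instructions differ between Python versions.

```python
import dis


def names_used(expr):
    """Return the names an expression must look up when evaluated (hypothetical helper)."""
    return compile(expr, "<expr>", "eval").co_names


if __name__ == "__main__":
    print(names_used("x"))       # one name lookup
    print(names_used("len(x)"))  # two name lookups, plus a function call
    dis.dis("len(x)")            # instruction names vary across Python versions
```

The stored-variable version needs a single lookup, while the `len(x)` version looks up both `len` and `x` and then performs a call.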

Community
TigerhawkT3
-1

Python's list stores its length in a field, so there is no big difference between the two approaches.

The first gets the length on every iteration of the loop (it is just a stored value; `len()` does no calculation), while the second calls the function only once. The time was the same in my test.
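
Even though the length is only read and not recomputed, the call itself still happens on every iteration in the first version. The following is a minimal sketch that makes this visible, assuming a hypothetical `CountingList` subclass that counts how often `len()` is invoked on it; the class and the two loop functions are illustrations, not code from the question.

```python
class CountingList(list):
    """Hypothetical list subclass that counts calls to len()."""

    def __init__(self, *args):
        self.len_calls = 0
        super().__init__(*args)

    def __len__(self):
        self.len_calls += 1
        return super().__len__()


def loop_with_len(lst, n):
    """Ask for the length on every iteration."""
    total = 0
    for _ in range(n):
        total += len(lst) + 1
    return total


def loop_with_cached(lst, n):
    """Ask for the length once, before the loop."""
    length = len(lst)
    total = 0
    for _ in range(n):
        total += length + 1
    return total


if __name__ == "__main__":
    a = CountingList([1, 2, 3])
    b = CountingList([1, 2, 3])
    print(loop_with_len(a, 10), a.len_calls)
    print(loop_with_cached(b, 10), b.len_calls)
```

Both loops produce the same total; only the number of `len()` calls differs, which is the per-iteration overhead the timing tests in the other answers measure.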

scriptboy
  • 3
    Why don't you tell us what your test was, so that we can see why they were the same? In my test, accessing a saved length was about four times faster than repeatedly calling `len`. – TigerhawkT3 Jan 08 '17 at 02:25