
Consider these two Python examples, each of which calls several functions that return a number and take an array element as an argument:

import numpy as np

my_array = np.zeros(looplength)

for j in range(0,looplength):

    temp = my_first_function(j)
    my_array[j] = temp
    a_1 = my_first_function(j,temp)
    a_2 = my_second_function(j,temp)
    a_3 = my_third_function(j,temp)
    ....
    a_N = my_Nth_function(j,temp)

vs

import numpy as np

my_array = np.zeros(looplength)

for j in range(0,looplength):

    my_array[j] = my_first_function(j)
    a_1 = my_first_function(j,my_array[j])
    a_2 = my_second_function(j,my_array[j])
    a_3 = my_third_function(j,my_array[j])
    ....
    a_N = my_Nth_function(j,my_array[j])

My question is: for performance, is it better to copy the value into a temporary variable or to access the array element directly, if this happens repeatedly? Also: how many times does an array element need to be accessed before copying it becomes faster?

Grismar
Siderius

1 Answer

Because my_array could be modified inside any call to my_<nth>_function, the compiler cannot optimise away the repeated my_array[j] expressions, so the lookup has to be performed every single time.
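To make the repeated lookup visible, here is a minimal sketch that counts element reads. CountingList, use, direct_access and local_copy are hypothetical names made up for this illustration, and a plain list subclass stands in for the NumPy array from the question:

```python
class CountingList(list):
    """A list that counts how often an element is read via indexing."""

    def __init__(self, *args):
        super().__init__(*args)
        self.reads = 0

    def __getitem__(self, index):
        self.reads += 1
        return super().__getitem__(index)


def use(value):
    # Stand-in for my_first_function, my_second_function, and so on.
    return value + 1


def direct_access(data, j, uses):
    # The element is looked up again for every call.
    for _ in range(uses):
        use(data[j])


def local_copy(data, j, uses):
    # The element is looked up once and then reused.
    temp = data[j]
    for _ in range(uses):
        use(temp)


direct = CountingList([0, 1, 2])
direct_access(direct, 1, 5)

copied = CountingList([0, 1, 2])
local_copy(copied, 1, 5)

print(direct.reads, copied.reads)  # prints "5 1"
```

The counts only show how many lookups happen, not how expensive each one is; that cost depends on the container type and the Python implementation.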

As a result, extracting the value into a local variable for reuse may perform better, depending somewhat on the Python implementation used and the number of uses. I'd venture that anything over three uses probably makes it worth it for performance alone.

For readability, you could consider doing it sooner, even if there are only two uses: although that may be a tiny bit slower (creating the variable, assigning the value, cleaning it up), it may improve your code. But that's a different matter.

You don't have to take my word for it though, try this:

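from timeit import timeit


def number(x):
    # trivial stand-in for the functions in the question
    return x + 1


def n_times(n):
    # look the element up again on every iteration
    a = [0, 1, 2]
    for i in range(n):
        b = number(a[1])


def n_times_copied(n):
    # look the element up once and reuse the local copy
    a = [0, 1, 2]
    x = a[1]
    for i in range(n):
        b = number(x)


def main():
    # timeit runs each callable a million times by default
    for n in range(10):
        print(f'n is {n}, access: {timeit(lambda: n_times(n))}')
        print(f'n is {n}, copied: {timeit(lambda: n_times_copied(n))}')


main()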

My results:

n is 0, access: 0.2166731
n is 0, copied: 0.22873619999999995
n is 1, access: 0.3126325000000001
n is 1, copied: 0.34623590000000004
n is 2, access: 0.4013982999999999
n is 2, copied: 0.3592341000000001
n is 3, access: 0.5191629
n is 3, copied: 0.39491809999999994
n is 4, access: 0.4818688999999998
n is 4, copied: 0.4481706000000001
n is 5, access: 0.5782233999999997
n is 5, copied: 0.5087457999999998
n is 6, access: 0.6317819
n is 6, copied: 0.5696268
n is 7, access: 0.7247358000000004
n is 7, copied: 0.6597318000000003
n is 8, access: 0.7239683000000001
n is 8, copied: 0.6870645999999994
n is 9, access: 0.8341450000000012
n is 9, copied: 0.7839662999999994

These results are from standard Python 3.9.7 (which I happened to have installed in my sandbox), running on Windows 10 on an Intel Core i9-9900K.

Note that there's a lot of variation (each call to timeit runs the passed callable a million times), but for me, the faster variant flips from direct access to the copy at "n is 2" or "n is 3", depending on the attempt.
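One way to reduce that run-to-run variation is to repeat each measurement and keep the fastest run, using timeit.repeat from the standard library. This is only a sketch: best_time and its default repeat and call counts are assumptions chosen for illustration, so the absolute numbers are not comparable with the per-million timings above.

```python
from timeit import repeat


def number(x):
    return x + 1


def n_times(n):
    a = [0, 1, 2]
    for i in range(n):
        b = number(a[1])


def n_times_copied(n):
    a = [0, 1, 2]
    x = a[1]
    for i in range(n):
        b = number(x)


def best_time(func, n, repeats=5, number_of_calls=20_000):
    # Run the measurement several times and keep the fastest run,
    # which is the one least disturbed by other activity on the machine.
    return min(repeat(lambda: func(n), repeat=repeats, number=number_of_calls))


def main():
    for n in range(10):
        print(f'n is {n}, access: {best_time(n_times, n)}')
        print(f'n is {n}, copied: {best_time(n_times_copied, n)}')


if __name__ == '__main__':
    main()
```

Taking the minimum rather than the mean is a common choice here, since slower runs mostly reflect interference from other processes, not the code being measured.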

Grismar
  • Please focus your question only on the computational side, not on my language. If you find any "pompous" expression feel free to modify the body of the question making it "more simpler". – Siderius Oct 26 '21 at 21:28
  • Please remove that side note. You have improved the body of the question. – Siderius Oct 27 '21 at 10:45