
This is a performance question: I am trying to optimize the following double for loop. Here is a MWE:

import numpy as np 
from timeit import default_timer as tm

# L1 and L2 will range from 0 to 3 typically, sometimes up to 5
# all of the following are dummy values, but they have the correct types
L1, L2, x1, x2, fac = 2, 3, 2.0, 4.5, 2.3
saved_values = np.random.uniform(high=75.0, size=[max(L1,L2) + 1, max(L1,L2) + 1]) 
facts = np.random.uniform(high=65.0, size=[L1 + L2 + 1])
val = 0

start = tm()

for i in range(L1+1):
    sf = saved_values[L1][i] * x1 ** (L1 - i)
    for j in range(L2 + 1):
        m = i + j
        if m % 2 == 0:
            num = sf * facts[m] / (2 * fac) ** (m / 2)
            val += saved_values[L2][j] * x1 ** (L1 - j) * num

end = tm()
time = end-start

print("Long way: time taken was {} and value is {}".format(time, val))

My idea for a solution is to remove the if m % 2 == 0: check, compute all (i, j) combinations at once as a matrix (which I should be able to vectorize), and then use something like np.where() to sum only the elements that satisfy m % 2 == 0, where m = i + j.
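Something along these lines is what I have in mind for the combinations matrix (a rough sketch on my part, not tested code):

i = np.arange(L1 + 1)
j = np.arange(L2 + 1)
m = i[:, None] + j[None, :]   # matrix of all i + j combinations, shape (L1 + 1, L2 + 1)
mask = (m % 2 == 0)           # True where a term should contribute to val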

Even if this is not faster than the explicit for loops, I still want it vectorized: in reality I will be passing arrays to a function containing this double for loop, and being able to vectorize that part should give me the speed gains I am after, even if vectorizing the double loop itself does not.

I am stuck spinning my wheels on how to set up the broadcasting so that it accounts for both the sf factor from the outer loop and the m = i + j dependence in the inner loop.
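For reference, this is roughly the direction I am imagining, continuing from the index arrays in the sketch above and just reproducing the MWE's formula term by term (including the x1 ** (L1 - j) factor); I have not verified that it reproduces the loop result:

sf = saved_values[L1, i] * x1 ** (L1 - i)      # outer-loop factor, shape (L1 + 1,)
inner = saved_values[L2, j] * x1 ** (L1 - j)   # inner-loop factor, shape (L2 + 1,)
# the odd-m entries get computed here and then discarded by the mask
terms = sf[:, None] * inner[None, :] * facts[m] / (2 * fac) ** (m / 2)
val_vec = np.where(mask, terms, 0.0).sum()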

  • If your goal is to speed up this computation, vectorization will probably not help much, since L1 and L2 are very small (only 12 values, and only 6 of them computed). Indeed, the downside of vectorization is the (slow) allocation of additional arrays. – Jérôme Richard Jun 26 '20 at 11:11
  • @JérômeRichard I am not that concerned with speed; it doesn't need to be faster, I just want it vectorized in a way that isn't noticeably slower. Also, the problem can be made bigger: on review, the dimensions could go up to 4!, though 3! or smaller will be more typical. For 4! this would be a 24x24 matrix, which is not all that small (but not large either). – Dilbert Jun 26 '20 at 16:42

0 Answers