I have two 3-dimensional arrays, A, B, where
- A has dimensions (500 x 500 x 80), and
- B has dimensions (500 x 80 x 2000).
In both arrays, the dimension of size 80 can be called 'time' (there are 80 timepoints). The dimension of size 2000 can be called 'scenario' (there are 2000 scenarios).
What I need to do is take the 500 x 500 matrix A[:, :, i] and multiply it by each 500-element column vector at the corresponding time, B[:, i, scenario], for every scenario and every time i.
I eventually ended up with the code below:

```python
from scipy.stats import norm
import numpy as np

A = norm.rvs(size=(500, 500, 80), random_state=0)
B = norm.rvs(size=(500, 80, 2000), random_state=0)
result = np.einsum('ijk,jkl->ikl', A, B, optimize=True)
```
while a naive approach to the same problem would be a nested for loop:

```python
out = np.empty((500, 80, 2000))  # preallocate the result
for scenario in range(2000):
    for i in range(80):
        out[:, i, scenario] = A[:, :, i] @ B[:, i, scenario]
```
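As a sanity check (at hypothetical reduced sizes, not the 500 x 500 x 80 problem above), the loop and the `einsum` call do compute the same thing:

```python
import numpy as np

# Reduced sizes for a quick equivalence check: 6x6 matrices,
# 4 timepoints, 5 scenarios (illustrative values only).
rng = np.random.default_rng(0)
n, t, s = 6, 4, 5
A = rng.standard_normal((n, n, t))
B = rng.standard_normal((n, t, s))

# Naive nested loop: one matrix-vector product per (time, scenario).
out = np.empty((n, t, s))
for scenario in range(s):
    for i in range(t):
        out[:, i, scenario] = A[:, :, i] @ B[:, i, scenario]

# The einsum contraction from the question.
result = np.einsum('ijk,jkl->ikl', A, B, optimize=True)

assert np.allclose(out, result)
```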
I expected `einsum` to be quite fast because the problem 'only' involves simple operations on a large array, but it actually runs for several minutes.
I compared the speed of the `einsum` call above to a simplified case: if we assume each matrix in A is the same, we can keep A as a single (500 x 500) matrix instead of a 3-d array, and the whole problem becomes

```python
A = norm.rvs(size=(500, 500), random_state=0)
B = norm.rvs(size=(500, 80, 2000), random_state=0)
result = np.einsum('ij,jkl->ikl', A, B, optimize=True)
```
This is fast and runs in only a few seconds, much quicker than the 'slightly' more general case above.

My question is: can I write the general case, which is slow with `einsum`, in a computationally more efficient form?
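One reformulation I have been considering (shown here at reduced sizes, and not yet benchmarked at full scale) is to express the contraction as a batched matrix product via `np.matmul`, by moving the 'time' axis to the front of both arrays so each time slice becomes one BLAS-backed matmul:

```python
import numpy as np

# Reduced illustrative sizes: 50x50 matrices, 8 timepoints, 200 scenarios.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50, 8))
B = rng.standard_normal((50, 8, 200))

# A -> (time, 500, 500), B -> (time, 500, scenarios); matmul batches
# over the leading 'time' axis, giving (time, 500, scenarios).
batched = np.matmul(A.transpose(2, 0, 1), B.transpose(1, 0, 2))

# Reorder back to the (500, time, scenarios) layout of the question.
result = batched.transpose(1, 0, 2)

# Agrees with the general einsum contraction.
reference = np.einsum('ijk,jkl->ikl', A, B, optimize=True)
assert np.allclose(result, reference)
```

Whether this actually beats `einsum` at the full 500 x 500 x 80 / 500 x 80 x 2000 sizes would need timing; the transposes produce non-contiguous views, so memory layout may matter.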