
I have to execute the below operation several thousand times and it is slowing down my code substantially:

import numpy as np

T = 50
D = 10
K = 20

x = np.random.randn(T, D)
y = np.random.randn(T, K)

result = np.zeros((K, D))

for k in range(K):
    for t in range(T):
        result[k] += y[t, k] * x[t]  # Multiply scalar element in y with row in x

Basically, I'm trying to multiply each element in column k of matrix y by the corresponding row in x and sum the results. I tried using np.einsum() to solve this:

result = np.einsum("ij,ik->jk", y, x)

which at least gives me result.shape == (K, D), but the results don't match! How can I efficiently perform this operation? Is this even possible with np.einsum()?

Tim Hilt
1 Answer


Those operations are the same. You already found the (likely fastest) vectorized operation.

import numpy as np

T = 50
D = 10
K = 20

x = np.random.randn(T, D)
y = np.random.randn(T, K)

result = np.zeros((K, D))

for k in range(K):
    for t in range(T):
        result[k] += y[t, k] * x[t]
           
result2 = np.einsum("ij,ik->jk", y, x)

np.allclose(result, result2)
Out[]: True

Likely the problem is floating-point error in whatever method you used to determine whether they were "the same." np.allclose() is the solution to that: rather than comparing elements exactly, it checks that they agree within a small tolerance, which absorbs the tiny discrepancies that arise between different methods of computing the same result with floats.
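To illustrate the point, here is a minimal sketch of how np.allclose() differs from exact comparison; by default it checks |a - b| <= atol + rtol * |b| with rtol=1e-5 and atol=1e-8:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = a + 1e-12  # tiny float-level discrepancy, far below default tolerances

exact = np.array_equal(a, b)   # elementwise exact comparison: False
close = np.allclose(a, b)      # within tolerance: True
```

So two results computed by different (but mathematically equivalent) routes can fail `==` while still being "the same" for all practical purposes.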

As @QuangHoang states in the comments, though, y.T @ x is much more readable.
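For completeness, a quick sketch verifying that the matrix-product form agrees with the einsum call (the shapes T=50, D=10, K=20 are taken from the question):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal((50, 10))   # (T, D)
y = rng.standard_normal((50, 20))   # (T, K)

via_einsum = np.einsum("ij,ik->jk", y, x)
via_matmul = y.T @ x                # (K, T) @ (T, D) -> (K, D)

assert np.allclose(via_einsum, via_matmul)
```

Both contract over the shared T axis, which is exactly the sum over t in the original double loop.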

Daniel F
  • Oh man, I really should have used `np.allclose()`! Thanks for reminding me of that function. Also, I guess figuring out that my goal was a simple matrix multiplication was too hidden in the paper I tried to implement. Thanks for the answer! – Tim Hilt Nov 07 '20 at 07:36