Efficient way to compute an array matrix multiplication for a batch of arrays

Question

I want to parallelize the following problem. Given an array w with shape (dim1,) and a matrix A with shape (dim1, dim2), I want each row of A to be multiplied for the corresponding element of w. That's quite trivial.

However, I want to do that for a bunch of arrays w and finally sum the result. So that, to avoid the for loop, I created the matrix W with shape (n_samples, dim1), and I used the np.einsum function in the following way:

x = np.einsum('ji, ik -> jik', W, A))
r = x.sum(axis=0)

where the shape of x is (n_samples, dim1, dim2) and the final sum has shape (dim1, dim2).

I noticed that np.einsum is quite slow for a large matrix A. Is there any more efficient way of solving this problem? I also wanted to try with np.tensordot but maybe this is not the case.

Thank you :-)

You could omit the ```j``` subscript in the output and there will be no need to sum the array as an extra step. — Kevin, Apr 04 '21 at 23:53
Tensordot is really just a generalization of einsum. If you want explicity, you should use einsum. Most optimizations are done under the hood anyways in numpy :) — Kevin, Apr 04 '21 at 23:56
`tensordot` reshapes and transposes its inputs, reducing the problem to a `np.dot`, with some further massaging after. `np.einsum` is much more general. `np.matmul/@` is **the** batched matrix multiplication function. — hpaulj, Apr 05 '21 at 00:57

hpaulj · Accepted Answer · 2021-04-05T01:17:52.210

In [455]: W = np.arange(1,7).reshape(2,3); A = np.arange(1,13).reshape(3,4)

Your calculation:

In [463]: x = np.einsum('ji, ik -> jik', W, A)
     ...: r = x.sum(axis=0)
In [464]: r
Out[464]: 
array([[  5,  10,  15,  20],
       [ 35,  42,  49,  56],
       [ 81,  90,  99, 108]])

As noted in a comment, einsum can perform the sum on j:

In [465]: np.einsum('ji, ik -> ik', W, A)
Out[465]: 
array([[  5,  10,  15,  20],
       [ 35,  42,  49,  56],
       [ 81,  90,  99, 108]])

And since j only occurs in A, we can sum on A first:

In [466]: np.sum(W,axis=0)[:,None]*A
Out[466]: 
array([[  5,  10,  15,  20],
       [ 35,  42,  49,  56],
       [ 81,  90,  99, 108]])

This doesn't involve a sum-of-products, so isn't matrix multiplication.

Or doing the sum after multiplication:

In [475]: (W[:,:,None]*A).sum(axis=0)
Out[475]: 
array([[  5,  10,  15,  20],
       [ 35,  42,  49,  56],
       [ 81,  90,  99, 108]])

Efficient way to compute an array matrix multiplication for a batch of arrays

1 Answers1