According to the documentation:

> For N dimensions `dot` is a sum product over the last axis of `a` and the second-to-last of `b`:
>
> `dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])`
I would like to compute the sum product over the last axis of `a` and the second-to-last of `b`, but without forming the Cartesian product over the remaining axes, because the remaining axes have the same shape. Let me illustrate with an example:
```python
import numpy as np

a = np.random.normal(size=(11, 12, 13))
b = np.random.normal(size=(11, 12, 13, 13))
c = np.dot(a, b)
c.shape  # = (11, 12, 11, 12, 13)
```
But I would like the shape to be `(11, 12, 13)`. The desired effect can be achieved using broadcasting:

```python
c = np.sum(a[..., None] * b, axis=-2)
c.shape  # = (11, 12, 13)
```
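To make the relationship concrete: the result I'm after is the "diagonal" of the full `np.dot` output over the two leading axes. A minimal sketch, using broadcasted index arrays:

```python
import numpy as np

a = np.random.normal(size=(11, 12, 13))
b = np.random.normal(size=(11, 12, 13, 13))

full = np.dot(a, b)             # shape (11, 12, 11, 12, 13)
i = np.arange(11)[:, None]      # broadcast index over axis 0
j = np.arange(12)[None, :]      # broadcast index over axis 1
wanted = full[i, j, i, j]       # shape (11, 12, 13)

# Matches the broadcasting version above
c = np.sum(a[..., None] * b, axis=-2)
print(np.allclose(wanted, c))   # True
```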
But my arrays are relatively large, and I'd like to use the power of parallelised BLAS implementations, which `np.sum` does not appear to use but `np.dot` does. Any ideas on how to achieve this?
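For reference, one direction I've been looking at is `np.matmul`, which broadcasts over the leading axes and should route the batched matrix products to optimised kernels, though I'm not sure how well it parallelises in practice for this shape. A minimal sketch:

```python
import numpy as np

a = np.random.normal(size=(11, 12, 13))
b = np.random.normal(size=(11, 12, 13, 13))

# Treat each a[i, j] as a (1, 13) row vector and multiply it with the
# matching (13, 13) matrix b[i, j]; matmul broadcasts over leading axes.
c = np.matmul(a[..., None, :], b)[..., 0, :]
c.shape  # (11, 12, 13)

# An einsum spelling of the same contraction; with optimize=True it may
# be rewritten internally to use BLAS-backed routines.
c2 = np.einsum('...k,...km->...m', a, b, optimize=True)
print(np.allclose(c, c2))  # True
```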