2

Is there away to multiply two arrays and sum along an axis (or multiple axes) without allocating extra memory?

In this example:

import numpy as np

A = np.random.random((10, 10, 10))
B = np.random.random((10, 10, 10))

C = np.sum(A[:, None, :, :, None] * B[None, :, None, :, :], axis=(-1,-2))

When computing C, an intermediate matrix of size 10x10x10x10x10 is created only to be collapsed immediately. Is there a way to avoid this in numpy?

Tohiko
  • 1,860
  • 2
  • 18
  • 26
  • Do you want `C = A @ B.T`? – mozway Apr 06 '23 at 14:35
  • @mozway, not always, but I am realizing the example is too simple. I updated to be more requiring of broadcasting. – Tohiko Apr 06 '23 at 14:43
  • As the comments show a sum-of-products can often be cast as a `dot` of some sort. But more generally if you want to avoid this kind of large intermediate result, and still want compiled speed, you may have to use `numba`.. – hpaulj Apr 06 '23 at 15:16
  • @hpaulj couldn't this be achieved with `einsum`? `np.einsum('ikl,jlm->ijk', A, B)` – mozway Apr 06 '23 at 15:19
  • 1
    @mozway, I meant `of some sort` to include `einsum`. That's the most general sum-of-products tool. – hpaulj Apr 06 '23 at 15:29

1 Answers1

4

This looks like a dot product with the transpose of the second array:

C = A @ B.T

NB. The operation in the original question was: C = np.sum(A[:, None, :] * B[None, :, :], axis=-1).

Quick check:

C1 = np.sum(A[:, None, :] * B[None, :, :], axis=-1)
C2 = A @ B.T

assert np.allclose(C1, C2)

You can generalize with einsum:

np.einsum('ikl,jlm->ijk', A, B)

Quick check:

A = np.random.random((2, 3, 4))
B = np.random.random((2, 4, 5))

#             i    j   k  l    m         i   j    k   l  m
C1 = np.sum(A[:, None, :, :, None] * B[None, :, None, :, :], axis=(-1,-2))
C2 = np.einsum('ikl,jlm->ijk', A, B)

assert np.allclose(C1, C2)
mozway
  • 194,879
  • 13
  • 39
  • 75