Python: Taking the outer product of each row of a matrix with itself, summing it, and returning a vector of sums

Question

Say I have a matrix A of dimension N by M.

I wish to return an N-dimensional vector V where the nth element is the double sum of all pairwise products of the entries in the nth row of A.

$$V_n = \sum_{i=1}^{M} \sum_{j=1}^{M} A_{n,i} A_{n,j}$$

In loops, I guess I could do:

import numpy as np

V = np.zeros(A.shape[0])
for n in range(A.shape[0]):            # each row of A
    for i in range(A.shape[1]):
        for j in range(A.shape[1]):
            V[n] += A[n, i] * A[n, j]  # every pairwise product within row n

I want to vectorise this and I guess I could do:

V_temp = np.einsum('ij,ik->ijk', A, A)
V = np.einsum('ijk->i', A)

But I don't think this is a very memory-efficient way, as the intermediate step V_temp unnecessarily stores the whole outer products when all I need are the sums. Is there a better way to do this?

Thanks

`V_temp` isn't used, should the next line use it instead of `A`? — Tadhg McDonald-Jensen, Dec 12 '17 at 02:29

score 2 · Accepted Answer · answered Dec 12 '17 at 02:14

You can use

V=np.einsum("ni,nj->n",A,A)
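As a quick sanity check, here is a minimal sketch comparing this call with the triple loop from the question. The 2x2 matrix is made up for illustration and is not from the original post. Because the output subscript is only `n`, both `i` and `j` are summed out in the same call, so no `V_temp` array is created by hand.

```python
import numpy as np

# Small illustrative matrix (an assumption, not data from the question).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Reference result: the explicit triple loop from the question.
V_loop = np.zeros(A.shape[0])
for n in range(A.shape[0]):
    for i in range(A.shape[1]):
        for j in range(A.shape[1]):
            V_loop[n] += A[n, i] * A[n, j]

# Single einsum call: i and j are summed out, n is kept.
V = np.einsum("ni,nj->n", A, A)

print(V.tolist())              # [9.0, 49.0]
print(np.allclose(V, V_loop))  # True
```

For row `[1, 2]` the pairwise products are 1, 2, 2 and 4, which sum to 9; for row `[3, 4]` they are 9, 12, 12 and 16, which sum to 49.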

Roun


ah you beat me to it – James Lim Dec 12 '17 at 02:27
Thanks Roun, exactly what I needed! – chanyoungs Dec 12 '17 at 02:34

Paul Panzer · Answer 2 · 2017-12-12T03:00:49.230

2

You are actually calculating

A.sum(-1)**2

In other words, the sum over an outer product is just the product of the sums of the factors.
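A short sketch of why this holds for a fixed row n, using the index names i and j from the question's loop. The inner sum does not depend on i, so it can be pulled out of the outer sum:

```latex
\sum_{i}\sum_{j} A_{n,i} A_{n,j}
  = \sum_{i} A_{n,i} \Big( \sum_{j} A_{n,j} \Big)
  = \Big( \sum_{i} A_{n,i} \Big) \Big( \sum_{j} A_{n,j} \Big)
  = \Big( \sum_{i} A_{n,i} \Big)^{2}
```

This brings the work per row down from M^2 multiplications to a single sum over M entries followed by one squaring.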

Demo:

import timeit
import numpy as np

A = np.random.random((1000,1000))
np.allclose(np.einsum('ij,ik->i', A, A), A.sum(-1)**2)
# True
# timeit returns the total seconds for 10 runs; multiplying by 100 gives ms per run
t = timeit.timeit('np.einsum("ij,ik->i",A,A)', globals=dict(A=A,np=np), number=10)*100; f"{t:8.4f} ms"
# '948.4210 ms'
t = timeit.timeit('A.sum(-1)**2', globals=dict(A=A,np=np), number=10)*100; f"{t:8.4f} ms"
# '  0.7396 ms'

edited Dec 12 '17 at 03:00

answered Dec 12 '17 at 02:35

Paul Panzer


This is the best answer. `sum_i(sum_j(a_i * a_j))` is the same as `sum_i(a_i) * sum_j(a_j)` – James Lim Dec 12 '17 at 02:45

James Lim · Answer 3 · 2017-12-12T02:26:30.073

0

Perhaps you can use

np.einsum('ij,ik->i', A, A)

or the equivalent

np.einsum(A, [0,1], A, [0,2], [0])

On a 2015 MacBook, I get

In [35]: A = np.random.rand(100,100)

In [37]: %timeit for_loops(A)
640 ms ± 24.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [38]: %timeit np.einsum('ij,ik->i', A, A)
658 µs ± 7.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [39]: %timeit np.einsum(A, [0,1], A, [0,2], [0])
672 µs ± 19.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
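The session above does not show how `for_loops` was defined (the numbering skips `In [36]`). A plausible reconstruction, assuming it simply wraps the triple loop from the question, might look like the sketch below. The function body and the 20x20 test size are assumptions, not the author's code.

```python
import numpy as np

def for_loops(A):
    """Hypothetical stand-in for the undefined `for_loops` in the timings.

    Assumed to be the question's triple loop wrapped in a function.
    """
    V = np.zeros(A.shape[0])
    for n in range(A.shape[0]):
        for i in range(A.shape[1]):
            for j in range(A.shape[1]):
                V[n] += A[n, i] * A[n, j]
    return V

# Small random input so the pure-Python loop finishes quickly.
A = np.random.rand(20, 20)

# Both einsum spellings should agree with the loop version.
same_subscripts = np.allclose(for_loops(A), np.einsum('ij,ik->i', A, A))
same_sublists = np.allclose(for_loops(A), np.einsum(A, [0, 1], A, [0, 2], [0]))
print(same_subscripts, same_sublists)
```

In the sublist form, each integer plays the role of a subscript letter: `[0, 1]` corresponds to `ij`, `[0, 2]` to `ik`, and the final `[0]` to the output `i`.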

edited Dec 12 '17 at 02:26

answered Dec 12 '17 at 02:17

James Lim


Thanks James, I appreciate it! – chanyoungs Dec 12 '17 at 02:34
