
I'm computing the Mahalanobis distance via np.einsum:

np.einsum('nj,jk,nk->n', delta, VI, delta)

where VI, the inverse of the covariance matrix, is 783 x 783 and delta is 6000 x 783. This line takes 10 s to execute on my 2016 MacBook Pro. How can I make this go faster?

I have to compute this line 200k to 300k times. Vectorization may not be an option because VI is different for each class.

saad
  • Use `optimize` flag? – Divakar Mar 07 '19 at 16:21
  • With `optimize=True` time drops from 10s to 400ms. – hpaulj Mar 07 '19 at 17:15
  • When `optimize=True`, einsum computes the greedy path. When `optimize='optimal'` it takes longer but may compute a better contraction path (it might be the same in this case). If the dimensions of your arrays don't change, you can pre-compute this `path` and supply `optimize=path`, which is probably the best use case here so it doesn't need to compute the same thing over and over. See the last example in the numpy einsum docs. – Attack68 Mar 08 '19 at 17:04
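
A minimal sketch of the path pre-computation described in the comment above, assuming the array shapes stay fixed between calls (the shapes below mirror the question):

import numpy as np

VI = np.random.rand(783, 783)      # inverse covariance matrix, 783 x 783
delta = np.random.rand(6000, 783)  # 6000 x 783

# Compute the contraction path once for these fixed shapes
path, path_info = np.einsum_path('nj,jk,nk->n', delta, VI, delta,
                                 optimize='optimal')

# Reuse the pre-computed path on every subsequent call
dist_sq = np.einsum('nj,jk,nk->n', delta, VI, delta, optimize=path)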

1 Answer


No need for einsum; you can use a dot product, an elementwise product, and a sum instead:

VI = np.random.rand(783, 783)
delta = np.random.rand(6000, 783)

%timeit np.einsum('nj,jk,nk->n', delta, VI, delta)
# 7.05 s ± 89.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.sum((delta @ VI) * delta, axis=-1)
# 90 ms ± 4.72 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

out_1 = np.einsum('nj,jk,nk->n', delta, VI, delta)
out_2 = np.sum((delta @ VI) * delta, axis=-1)
np.allclose(out_1, out_2)
# True

How did I arrive at this?

nj,jk->nk is a dot product:

tmp_1 = np.einsum('nj,jk->nk', delta, VI)
tmp_2 = delta @ VI
np.allclose(tmp_1, tmp_2)  # True

nk,nk->nk is an elementwise product:

tmp_3 = np.einsum('nk,nk->nk', tmp_1, delta)
tmp_4 = tmp_2 * delta
np.allclose(tmp_3, tmp_4)  # True

and nk->n is a sum over the last axis:

tmp_5 = np.einsum('nk->n', tmp_3)
tmp_6 = np.sum(tmp_4, axis=-1)
np.allclose(tmp_5, tmp_6)  # True

Vectorizing VI

You will notice that vectorizing VI along the first axis will just work:

# Vectorized `VI`
nd_VI = np.random.rand(3, 783, 783)
# Unvectorized `VI`, for comparison
VI = nd_VI[0, :]
delta = np.random.rand(6000, 783)

out = np.sum((delta @ VI) * delta, axis=-1)
out.shape
# (6000,)

nd_out = np.sum((delta @ nd_VI) * delta, axis=-1)
nd_out.shape
# (3, 6000)

# Results of vectorized and unvectorized `VI` are the same
np.allclose(out, nd_out[0, :])
# True

Vectorizing VI and delta elementwise

The same works when vectorizing both VI and delta; just add leading axes of the same length to both VI and delta:

# Vectorized `VI`
nd_VI = np.random.rand(3, 783, 783)
# Unvectorized `VI`, for comparison
VI = nd_VI[0, ...]
# Vectorized `delta`
nd_delta = np.random.rand(3, 6000, 783)
# Unvectorized `delta`, for comparison
delta = nd_delta[0, ...]

out = np.sum((delta @ VI) * delta, axis=-1)
out.shape
# (6000,)

nd_out = np.sum((nd_delta @ nd_VI) * nd_delta, axis=-1)
nd_out.shape
# (3, 6000)

# Results of vectorized and unvectorized `VI` are the same
np.allclose(out, nd_out[0, ...])
# True

Vectorizing VI and delta independently

Or, if you want to calculate the Mahalanobis distance of every element in delta against every matrix in VI, you can use broadcasting:

# Vectorized `VI`, note the extra empty dimension (where `delta` has 3)
nd_VI = np.random.rand(4, 1, 783, 783)
# Unvectorized `VI`, for comparison
VI = nd_VI[0, 0, ...]
# Vectorized `delta`, note the extra empty dimension (where `VI` has 4)
nd_delta = np.random.rand(1, 3, 6000, 783)
# Unvectorized `delta`, for comparison
delta = nd_delta[0, 0, ...]

out = np.sum((delta @ VI) * delta, axis=-1)
out.shape
# (6000,)

nd_out = np.sum((nd_delta @ nd_VI) * nd_delta, axis=-1)
nd_out.shape
# (4, 3, 6000)

# Results of vectorized and unvectorized `VI` are the same
np.allclose(out, nd_out[0, 0, ...])
# True
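
Not part of the original answer, but a rough sketch of how this applies to the question's per-class setting, assuming a hypothetical batch of classes whose stacked inverse covariance matrices fit in memory (otherwise process the classes in smaller batches):

import numpy as np

n_classes, n_samples, n_dim = 300, 6000, 783  # hypothetical batch size

# Hypothetical per-class stacks: one inverse covariance and one delta block per class
nd_VI = np.random.rand(n_classes, n_dim, n_dim)
nd_delta = np.random.rand(n_classes, n_samples, n_dim)

# One batched computation replaces a Python loop over the classes
dists = np.sum((nd_delta @ nd_VI) * nd_delta, axis=-1)
dists.shape
# (300, 6000)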
Nils Werner