
With the recent update to NumPy (1.14), I found that it breaks my entire codebase. The cause appears to be the change of the default `einsum` `optimize` argument from False to True.

As a result, the following basic operation now fails:

import numpy as np

a = np.random.random([50, 2, 2])
b = np.random.random([50, 2])

np.einsum('bdc, ac -> ab', a, b, optimize=True)

with the following error trace:

ValueError                                Traceback (most recent call last)
<ipython-input-71-b0f9ce3c71a3> in <module>()
----> 1 np.einsum('bdc, ac -> ab', a, b, optimize=True)

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\einsumfunc.py in einsum(*operands, **kwargs)
   1118 
   1119             # Contract!
-> 1120             new_view = tensordot(*tmp_operands, axes=(tuple(left_pos), tuple(right_pos)))
   1121 
   1122             # Build a new view if needed

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\numeric.py in tensordot(a, b, axes)
   1301     oldb = [bs[axis] for axis in notin]
   1302 
-> 1303     at = a.transpose(newaxes_a).reshape(newshape_a)
   1304     bt = b.transpose(newaxes_b).reshape(newshape_b)
   1305     res = dot(at, bt)

ValueError: axes don't match array

The operation I'm requesting from einsum seems very simple... so why does it fail? If I set `optimize=False`, it works just fine.

I tried exploring with einsum_path but the resulting path info was the same with and without optimization.
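For reference, this is roughly how the contraction path can be inspected with einsum_path (a minimal sketch using the same arrays as above; the exact printout depends on the NumPy version):

import numpy as np

a = np.random.random([50, 2, 2])
b = np.random.random([50, 2])

# einsum_path returns the planned contraction order plus a readable summary
path, info = np.einsum_path('bdc, ac -> ab', a, b, optimize='greedy')
print(path)   # e.g. ['einsum_path', (0, 1)] -- a single pairwise contraction
print(info)   # naive vs. optimized FLOP counts and the chosen order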

sharkmas
  • Can it be because the code now tries to use BLAS "whenever possible" - see https://github.com/numpy/numpy/blob/master/doc/release/1.14.0-notes.rst ? Maybe it would be better to open an issue on https://github.com/numpy/numpy – AGN Gazer Feb 11 '18 at 01:21
  • I think `path` is useful when there are 3 or more arguments, and there's a question of which `dot` contraction should be done first (to reduce the problem space fastest). This is a different sort of contraction problem. – hpaulj Feb 11 '18 at 01:58
  • I need to pay more attention to this `optimize` parameter when recommending `einsum`, especially when using `timeit` tests. It's harder to know whether it's using the original `c_einsum` or one of these supposed optimizations. – hpaulj Feb 11 '18 at 17:40
  • I found the bug issue, #10343. – hpaulj Feb 11 '18 at 23:17

1 Answer

In [40]: a=np.ones((50,2,2),int); b=np.ones((50,2),int)
In [41]: np.einsum('bdc,ac->ab', a, b)
... 
ValueError: axes don't match array

I don't see what optimization has to do with this error.

For the first argument b,d,c are 50,2,2. For the second a,c are 50,2. The result should be 50,50. But what happened to d?

In [43]: np.einsum('bdc,ac->abd', a, b).shape
Out[43]: (50, 50, 2)

Oops:

In [49]: np.einsum('bdc,ac->ab', a, b, optimize=False).shape
Out[49]: (50, 50)

So it's summing on d.
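
A quick way to verify that (summing the 'abd' result over d reproduces the 'ab' result):

ab  = np.einsum('bdc,ac->ab',  a, b, optimize=False)
abd = np.einsum('bdc,ac->abd', a, b, optimize=False)
print(np.allclose(ab, abd.sum(axis=2)))   # True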


Note the error - with optimization, it uses tensordot (transpose plus dot), rather than the original einsum nditer approach.
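
For the contraction that does keep d, a tensordot equivalent looks roughly like this (my own reconstruction, not the exact code from einsumfunc.py):

# contract the shared 'c' axis: b's axis 1 against a's axis 2;
# the remaining axes come out in (a, b, d) order -> shape (50, 50, 2)
td = np.tensordot(b, a, axes=(1, 2))
print(td.shape)                                           # (50, 50, 2)
print(np.allclose(td, np.einsum('bdc,ac->abd', a, b)))    # True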

c_einsum is the one that can handle that missing d:

# If no optimization, run pure einsum
if optimize_arg is False:
    return c_einsum(*operands, **kwargs)

Tried some timings:

Two-step calculation with the default optimize:

In [64]: timeit np.einsum('abd->ab', np.einsum('bdc,ac->abd', a, b))
288 µs ± 518 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The desired c_einsum is faster:

In [65]: timeit np.einsum('bdc,ac->ab', a, b, optimize=False)
170 µs ± 83.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In fact c_einsum is faster even when the tensordot version works:

In [67]: timeit np.einsum('bdc,ac->abd', a, b,optimize=False)
73.1 µs ± 46.6 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [68]: timeit np.einsum('bdc,ac->abd', a, b,optimize=True)
207 µs ± 6.97 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This example might be too small to show off the advantages of tensordot/BLAS.
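
To actually see the BLAS path pay off, the contracted dimension presumably needs to be much larger than 2; something along these lines would be the test (sizes are made up, and I haven't run the numbers):

a_big = np.random.random((500, 64, 64))
b_big = np.random.random((500, 64))

# compare the nditer path against the tensordot/BLAS path on a bigger problem
timeit np.einsum('bdc,ac->abd', a_big, b_big, optimize=False)
timeit np.einsum('bdc,ac->abd', a_big, b_big, optimize=True)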

Looks like this has been raised on github - both the failures, and the slower 'optimize' speed: https://github.com/numpy/numpy/issues/10343 "einsum broadcast regression (with optimize=True)"

hpaulj
  • Because in v1.13 the optimize flag defaulted to False. If you run this code, with the flag set manually, it does work: np.einsum('bdc,ac->ab', a, b, optimize=False) gives array([[4, 4, 4, ..., 4, 4, 4], [4, 4, 4, ..., 4, 4, 4], [4, 4, 4, ..., 4, 4, 4], ..., [4, 4, 4, ..., 4, 4, 4], [4, 4, 4, ..., 4, 4, 4], [4, 4, 4, ..., 4, 4, 4]]) – sharkmas Feb 11 '18 at 01:27
  • Similarly, in v1.13, if you manually set optimize=True, you get the error. Thus, I feel the bug lies with the optimization function? Can anyone confirm this? – sharkmas Feb 11 '18 at 01:28
  • I don't have 1.13 installed any more, so can't test that behavior. I should probably delete this answer. I studied `einsum` in depth back several versions, but haven't paid much attention to the recent optimization. – hpaulj Feb 11 '18 at 01:28
  • Ok, sure. Thanks for your help. I believe this issue still stands then. – sharkmas Feb 11 '18 at 01:32
  • I just realized that with optimization it doesn't use the `nditer` approach that I studied - it uses `tensordot`. – hpaulj Feb 11 '18 at 01:34
  • _"Similarly, in v1.13, if you manually set optimize=True, you get the error."_ - **not true!** I will post an answer just to prove this and I will delete it in 10 min. – AGN Gazer Feb 11 '18 at 01:34
  • Ah, thanks for clarifying! It seems the devs are aware of this bug though: https://github.com/numpy/numpy/issues/10343 and a possible fix: https://github.com/numpy/numpy/pull/10352. Therefore, for now, I'll just manually disable optimization and wait until v1.15, when optimize will either work or be disabled by default again. – sharkmas Feb 11 '18 at 01:40