Numexpr for Python returning all zero arrays on certain hardware configurations

Question

I've recently discovered what appears to be a bug in Numexpr. Although I've already opened an issue on their GitHub, I figured I would avail myself of the collective wisdom here as well.

In a nutshell, evaluate sometimes (unpredictably) returns incorrect results for a straightforward array operation. The bug, which the Python code below reproduces, returns an array of zeros rather than the correct result. Although the sample code shows a multiplication, we have seen the same failure with addition and exponentiation. Notably, Numexpr raises no errors or warnings, the computational load appears normal (i.e. RAM and CPU are taxed as expected in Task Manager), and the returned array has the correct shape. It was a rather insidious bug to isolate for those reasons! In our tests, the bug has manifested only on the following hardware builds:

Windows Server 2012 R2, Intel Xeon 2680 v3, 2 processors, 48 logical cores
Windows 8.1, Intel Xeon 2690, 1 processor, 24 logical cores

In the many thousands of runs of our software on our Windows 7, 64-bit, Intel i7 machines, this has never manifested. Furthermore, we have run the sample code below many times on those machines (with bigger arrays and more iterations) without seeing the error. The Xeon computers, though, manifest it regularly. Unfortunately, we don't have any other builds on which to test.

Other items of note:

We are running from the WinPython distribution 3.4.3.6.
We have not invoked any supporting Numexpr functions, just evaluate... so we are using its default settings.
The version of Numexpr is 2.4.4, as included in WinPython 3.4.3.6

Sample Code:

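import numpy as np
import numexpr as ne

# Use an integer size: np.ones(1e6) passes a float, which newer NumPy versions reject.
x = np.ones(10**6)
y = np.ones(10**6)

for ii in range(1000):
    rr = ne.evaluate('x * y')
    # The product of two arrays of ones should be all ones, never all zeros.
    test = np.all(rr == 0)
    if test:
        print('Gotcha! %d' % ii)

print('Complete!')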

What numexpr version are you running? 2.4.4 in particular was known to be flaky on Windows. — DSM, Apr 07 '16 at 18:29
2.4.4, sorry. Updated question. Ahh, flaky on Windows? Can you direct me to documentation perchance? Any bit of insight into this will help! — Trekkie, Apr 07 '16 at 18:31
See [here](https://github.com/pydata/numexpr/issues/188) and [here](https://github.com/pydata/numexpr/issues/185). This caused so many problems upstream that we blacklisted it in pandas [here](https://github.com/pydata/pandas/issues/12489)-- see the list of similar problems at the bottom of the main [issue](https://github.com/pydata/pandas/issues/12023). — DSM, Apr 07 '16 at 18:36
Oh wow. It looks like this is exactly my problem... I even demonstrated it with a similar simple example. So yeah, 2.4.4 appears to be critically broken... the threads I've just now been reading suggest that the new build works. At this point, though, I don't know if I can trust it. We basically just use what's in the WinPython distribution... it becomes really problematic to start making "homemade" distributions of our own. Looks like I may just have to since the computational speed gains are so crucial. Pardon the nerd humor, but I am a sad panda now. — Trekkie, Apr 07 '16 at 18:43
Just upgrade your WinPython. 3.4.4.1 has 2.4.6, which doesn't have this problem. — DSM, Apr 07 '16 at 18:48
That'll be my first try, but that may break something else. :) Guess I'll cross that bridge when I get there! Thanks again for your help... this bug was quite insidious even to identify for us. Took several man hours to identify since it happens so randomly and since numexpr is sprinkled throughout our code. I really appreciate it. — Trekkie, Apr 07 '16 at 18:50
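
Since the comments above single out Numexpr 2.4.4 as the problem release, one way to avoid being bitten silently is to check the installed version at startup. The following is only a minimal sketch under stated assumptions: KNOWN_BAD, parse_version and is_known_bad are hypothetical names made up for illustration (they are not part of Numexpr), and the list of bad releases contains only the one version named in this thread.

```python
import re

# Hypothetical list for illustration: only the release this thread reports as broken.
KNOWN_BAD = {(2, 4, 4)}


def parse_version(text):
    """Turn a string such as '2.4.4' into a tuple such as (2, 4, 4).

    Only the first three dot-separated pieces are considered, and parsing
    stops at the first piece that does not start with a digit.
    """
    parts = []
    for piece in text.split('.')[:3]:
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_known_bad(version_text):
    """Return True if the version string matches a release in KNOWN_BAD."""
    return parse_version(version_text) in KNOWN_BAD


# Optional startup check; skipped quietly if Numexpr is not installed.
try:
    import numexpr as ne
except ImportError:
    ne = None

if ne is not None and is_known_bad(ne.__version__):
    print('Warning: numexpr %s is reported to return wrong results on Windows; '
          'upgrade before trusting evaluate().' % ne.__version__)
```

A check like this only flags the version; it does not verify results. Whether to warn, raise, or fall back to plain NumPy when the check fires is a design choice left to the application.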

