numpy.reciprocal returns different values when called repeatedly

Question

I have a numpy array ssh_sum:

>>> ssh_sum
array([[ 0.,  2.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  2.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  2.]])

I wanted to compute the element-wise reciprocal values in this array. Numpy returns different values when I call np.reciprocal repeatedly:

>>> ssh_sum
array([[ 0.,  2.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  2.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  2.]])
>>> np.reciprocal(ssh_sum, where=(ssh_sum > 0.))
array([[  6.90326535e-310,   5.00000000e-001,   1.00000000e+000,
          0.00000000e+000,   1.07034283e-296,   1.33666925e+241],
       [  4.74783847e-309,   1.45260789e-296,   1.00000000e+000,
          5.00000000e-001,   2.13436228e-287,  -3.13188338e-294],
       [  4.85105226e-309,   1.08690709e+171,   4.09521901e+149,
          1.00000000e+000,   2.82730247e-311,   5.00000000e-001]])
>>> np.reciprocal(ssh_sum, where=(ssh_sum > 0.))
array([[ inf,  0.5,  1. ,  inf,  inf,  inf],
       [ inf,  inf,  1. ,  0.5,  inf,  inf],
       [ inf,  inf,  inf,  1. ,  inf,  0.5]])
>>> np.reciprocal(ssh_sum, where=(ssh_sum > 0.))
array([[  6.90326535e-310,   5.00000000e-001,   1.00000000e+000,
          0.00000000e+000,   1.07034283e-296,   1.33666925e+241],
       [  4.74783847e-309,   1.45260789e-296,   1.00000000e+000,
          5.00000000e-001,   2.13436228e-287,  -3.13188338e-294],
       [  4.85105226e-309,   1.08690709e+171,   4.09521901e+149,
          1.00000000e+000,   2.82730247e-311,   5.00000000e-001]])
>>> np.reciprocal(ssh_sum, where=(ssh_sum > 0.))
array([[ inf,  0.5,  1. ,  inf,  inf,  inf],
       [ inf,  inf,  1. ,  0.5,  inf,  inf],
       [ inf,  inf,  inf,  1. ,  inf,  0.5]])

Any idea what's going on here? I'm using Python 3.4.5 and numpy 1.13.3.

If you need a workaround, there's always; `m = ssh_sum > 0 ; ssh_sum[m] = 1 / ssh_sum[m]` — cs95, Mar 24 '18 at 00:27
thanks @cᴏʟᴅsᴘᴇᴇᴅ your second comment saved the day — Hinton, Mar 24 '18 at 00:30
Can you upgrade to 1.14? If this is a bug in your version of numpy, or an install problem with your copy of numpy, who knows whether it will affect anything else you're relying on? — abarnert, Mar 24 '18 at 00:32
I just upgraded to 1.14, and the behavior is now consistent across calls, but instead of leaving 0 values at zero it's still setting them to some garbage like 4.85105226e-309 — Hinton, Mar 24 '18 at 00:32
Maybe it's my CPU --- Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz , microcode 0xb00001b — Hinton, Mar 24 '18 at 00:33
4.05e-309 is pretty close to 0—as in more than close enough for `np.isclose` with defaults. — abarnert, Mar 24 '18 at 00:37

Warren Weckesser · Accepted Answer · 2018-03-24T06:06:59.997

It is not just reciprocal; the issue occurs with any use of the where argument. I've been able to reproduce the issue with the master branch of numpy (np.__version__ is '1.15.0.dev0+c093997'), with functions such as abs, sign, add, subtract, etc.

If you read the docstrings of the numpy "ufuncs" carefully and interpret them correctly, you'll see that the behavior is not a bug. Here are the relevant descriptions from the numpy.reciprocal docstring:

out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or `None`,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    Values of True indicate to calculate the ufunc at that position, values
    of False indicate to leave the value in the output alone.

Note, in particular:

where says "values of False indicate to leave the value in the output alone."
out says "If not provided or None, a freshly-allocated array is returned."

You did not provide an out argument, so a new array is allocated by your call to reciprocal. The contents of this array are not initialized; the array holds whatever happened to be in the allocated memory. When you use the where argument, only those positions in the output where where is True are assigned values. Positions where where is False are not touched, so they hold whatever random stuff was there when the array was allocated. For floating point output, the random stuff in the output might be 0.0, 4.85105226e-309, or any other random values.

To use the where argument the way you intended, you should also provide your own out argument, initialized with the values you want in the output where where is False. In your case, you should pass in an array of zeros:

In [84]: ssh_sum
Out[84]: 
array([[0., 2., 1., 0., 0., 0.],
       [0., 0., 1., 2., 0., 0.],
       [0., 0., 0., 1., 0., 2.]])

In [85]: out = np.zeros_like(ssh_sum)

In [86]: np.reciprocal(ssh_sum, where=ssh_sum > 0.0, out=out)
Out[86]: 
array([[0. , 0.5, 1. , 0. , 0. , 0. ],
       [0. , 0. , 1. , 0.5, 0. , 0. ],
       [0. , 0. , 0. , 1. , 0. , 0.5]])

In [87]: out
Out[87]: 
array([[0. , 0.5, 1. , 0. , 0. , 0. ],
       [0. , 0. , 1. , 0.5, 0. , 0. ],
       [0. , 0. , 0. , 1. , 0. , 0.5]])

This is awesome, thanks. I assume the reason I couldn't reproduce it was because my Mac automatically allocated a block of 0s. — cs95, Mar 24 '18 at 16:14

numpy.reciprocal returns different values when called repeatedly

1 Answers1

Linked