I've got a very simple task and numpy is doing something I don't understand. I'm trying to replace elements of an array that meet some criteria with a number between 0 and 1, and numpy is converting them all into zeroes. For example:

In [1]: some_array = np.array([0,0,0,1,0,1,1,1,0])

In [2]: nonzero_idxs = np.where(some_array != 0)[0]

In [3]: nonzero_idxs
Out[3]: array([3, 5, 6, 7])

In [4]: some_array[nonzero_idxs] = 99

In [5]: some_array
Out[5]: array([ 0,  0,  0, 99,  0, 99, 99, 99,  0])

In [6]: some_array[nonzero_idxs] = 0.2

In [7]: some_array[nonzero_idxs]
Out[7]: array([0, 0, 0, 0])

In [8]: some_array[nonzero_idxs] == 0
Out[8]: array([ True,  True,  True,  True], dtype=bool)

As the example above shows, replacing values with an arbitrary integer works as expected, but replacing them with a decimal turns them into zeros (and they don't just print as zeros; they evaluate as equal to zero). The same thing happens when I go about this in other ways, for example by using np.place.

I'm doing this within IPython in the terminal, if that makes any difference. Can someone explain what's happening here, and how to avoid it? Apologies if this is a duplicate.

Teachey
  • OK yes, I just tried the same test a second ago... it comes out as 5. So.... why is it forcing them to be integers? That seems like strange behavior to me. – Teachey Apr 11 '19 at 14:32
  • Thanks @pault. The answer is buried in the responses to that question. If the array contains nothing but integers, it refuses to let you replace an element with a float. You have to do something like some_array = some_array.astype(float). I never would have figured that on my own. – Teachey Apr 11 '19 at 14:35
  • Perhaps that question should be closed as a duplicate of this one. – pault Apr 11 '19 at 14:39

3 Answers

some_array = np.array([0,0,0,1,0,1,1,1,0]).astype(float) 

Converting the array to a float dtype will solve your issue. By default the array is created with an integer dtype, so the assigned value is truncated to zero.

nonzero_idxs = np.where(some_array != 0)[0]
some_array[nonzero_idxs] = 0.2
# output: array([0. , 0. , 0. , 0.2, 0. , 0.2, 0.2, 0.2, 0. ])
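Note that astype returns a new array rather than converting in place, so the result has to be assigned to a name. A minimal sketch of that, using a boolean mask instead of np.where (the names int_array, float_array and mask are illustrative, not from the question):

```python
import numpy as np

int_array = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0])

# astype returns a converted copy; the original keeps its integer dtype
float_array = int_array.astype(float)

# A boolean mask selects the same elements as np.where(...)[0]
mask = float_array != 0
float_array[mask] = 0.2

print(int_array.dtype.kind)  # 'i' (some integer dtype)
print(float_array.dtype)     # float64
print(float_array)
```

The original int_array is left untouched, which is handy if the integer version is still needed elsewhere.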
NoorJafri

The reason is simple.

Unlike Python lists, a NumPy array stores all of its elements with a single dtype.

When you defined some_array, it was created with an integer dtype (int32 or int64, depending on the platform and NumPy version). Therefore, when you tried to assign a float to it, the value was cast to an integer, and int(0.2) == 0.

Compare with the case where you specify that the array should contain floats (dtype=float gives float64; the old np.float alias has been removed from recent NumPy releases):

some_array = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0], dtype=float)
nonzero_idxs = np.where(some_array != 0)[0]
some_array[nonzero_idxs] = 0.2
some_array

Output:

array([0. , 0. , 0. , 0.2, 0. , 0.2, 0.2, 0.2, 0. ])
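To see the cast described above in isolation, here is a small sketch (variable names are made up for illustration). It shows that floats written into an integer array are truncated toward zero, and that the three-argument form of np.where sidesteps the problem by building a new float array instead of assigning in place:

```python
import numpy as np

int_array = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0])

# Floats assigned into an integer array are truncated toward zero
int_array[[3, 5, 6, 7]] = [0.2, 0.9, 1.7, -1.7]
print(int_array)  # positions 3, 5, 6, 7 now hold 0, 0, 1, -1

# Three-argument np.where builds a new array; combining a float with an
# integer array yields a float64 result, so 0.2 survives
source = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0])
replaced = np.where(source != 0, 0.2, source)
print(replaced.dtype)  # float64
print(replaced)
```

Whether to convert the existing array or build a new one with np.where is mostly a matter of taste; the second avoids mutating the input.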
gmds

The problem is dtype. It is int64, and you need to change it to float64 (or just float):

some_array = some_array.astype('float')

Ildar Akhmetov
  • Side note, it won't be `int64` on Windows - it will be `int32`. This can lead to unexpected and silent overflows in things like `sum()` so it's worth keeping in mind. – roganjosh Apr 11 '19 at 14:31
  • Just tried it on Windows with Jupyter, and it's `int64` by default. Maybe it depends on the architecture. – Ildar Akhmetov Apr 11 '19 at 14:32
  • Jupyter is intervening then. It should not be `int64` in Python. What version of Windows? Maybe they changed the default int type in Windows 10 – roganjosh Apr 11 '19 at 14:33
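
Since the default integer width evidently varies between setups, it may be simplest to check it directly. A minimal sketch (the names default_int and explicit are illustrative):

```python
import numpy as np

# dtype NumPy picks for an array built from plain Python ints;
# this is the platform-dependent default discussed in the comments
default_int = np.array([1, 2, 3]).dtype
print(default_int)

# Requesting the width explicitly removes the platform dependence
explicit = np.array([1, 2, 3], dtype=np.int64)
print(explicit.dtype)  # int64
```

Either way the array is still an integer array, so assigning 0.2 into it truncates to 0 regardless of the width.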