I have a structured NumPy array:

a = numpy.zeros((10, 10), dtype=[
    ("x", int),
    ("y", str)])

I want to set values in a["y"] to "hello" wherever the corresponding value in a["x"] is negative. As far as I can tell, I should be doing that like this:

a["y"][a["x"] < 0] = "hello"

But this seems to change the values in a["x"]! What is the problem with what I'm doing, and how else should I do this?

Matthew
  • What version of numpy are you using (`print numpy.__version__`)? I do not see the bleeding of characters into the integer field, but that would be a serious bug. – Jaime Aug 09 '14 at 15:49
  • 1.8.1, and yeah, it would be. – Matthew Aug 09 '14 at 15:50
  • `a = numpy.zeros((10, 10), dtype=[("x", int), ("y", str)]); a["x"][:] = numpy.random.randint(-10, 10, (10, 10)); print a["x"]; a["y"][a["x"] < 0] = "hello"; print a["x"]` - produces two different printed outputs. – Matthew Aug 09 '14 at 15:52
  • Changing `str` to `"a"` also reproduces it. – Matthew Aug 09 '14 at 15:55
  • I have filed a bug report, see [here](https://github.com/numpy/numpy/issues/4955). The problem seems to be that your string fields are being created with length 0. – Jaime Aug 09 '14 at 16:06

1 Answer

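First of all, in NumPy structured arrays, when you specify the data type as str without a length, NumPy does not size the field for your text (the dtype below prints as a bare 'S'), so it effectively behaves as a 1-character string.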

>>> import numpy
>>> a = numpy.zeros((10, 10), dtype=[
...         ("x", int),
...         ("y", str)])

>>> print a.dtype
dtype([('x', '<i8'), ('y', 'S')])

As a result the values you enter get truncated to 1 character.

>>> a["y"][0][0] = "hello"
>>> print a["y"][0][0]
h

Hence, use a sized string data type such as a10 (equivalently S10), where 10 is the maximum length of your string.

Refer to this link, which gives the definitions for the other data types.
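To make the sizing point concrete, here is a minimal sketch, assuming Python 3 and a recent NumPy (the outputs elsewhere in this answer are from Python 2 and NumPy 1.8). The names dt, first and second are only illustrative. It shows that a field declared as S10 reserves 10 bytes per element, and that longer values are cut to fit.

```python
import numpy as np

# "S10" is a fixed-width byte string of 10 bytes ("a10" was the older alias).
dt = np.dtype([("x", np.int64), ("y", "S10")])
a = np.zeros((2, 2), dtype=dt)

a["y"][0, 0] = "hello"                           # fits within 10 bytes
a["y"][0, 1] = "a string longer than ten bytes"  # cut to the first 10 bytes

y_itemsize = dt["y"].itemsize  # bytes reserved per element of field "y"
first = a["y"][0, 0]           # stored as bytes under Python 3
second = a["y"][0, 1]

print(y_itemsize, first, second)
```

Under Python 3 the values come back as bytes objects; if you would rather work with text strings, a Unicode field such as U10 is worth considering instead.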

Secondly, your approach seems correct to me.

Initializing a structured NumPy array with data types int and a10

>>> a = numpy.zeros((10, 10), dtype=[("x", int), ("y", 'a10')])

Filling it with random numbers

>>> a["x"][:] = numpy.random.randint(-10, 10, (10,10))
>>> print a["x"]
[[  2  -4 -10  -3  -4   4   3  -8 -10   2]
 [  5  -9  -4  -1   9 -10   3   0  -8   2]
 [  5  -4 -10 -10  -1  -8  -1   0   8  -4]
 [ -7  -3  -2   4   6   6  -8   3  -8   8]
 [  1   2   2  -6   2  -9   3   6   6  -6]
 [ -6   2  -8  -8   4   5   8   7  -5  -3]
 [ -5  -1  -1   9   5  -7   2  -2  -9   3]
 [  3 -10   7  -8  -4  -2  -4   8   5   0]
 [  5   6   5   8  -8   5 -10  -6  -2   1]
 [  9   4  -8   6   2   4 -10  -1   9  -6]]

Applying your filtering

>>> a["y"][a["x"]<0] = "hello"
>>> print a["y"]
[['' 'hello' 'hello' 'hello' 'hello' '' '' 'hello' 'hello' '']
 ['' 'hello' 'hello' 'hello' '' 'hello' '' '' 'hello' '']
 ['' 'hello' 'hello' 'hello' 'hello' 'hello' 'hello' '' '' 'hello']
 ['hello' 'hello' 'hello' '' '' '' 'hello' '' 'hello' '']
 ['' '' '' 'hello' '' 'hello' '' '' '' 'hello']
 ['hello' '' 'hello' 'hello' '' '' '' '' 'hello' 'hello']
 ['hello' 'hello' 'hello' '' '' 'hello' '' 'hello' 'hello' '']
 ['' 'hello' '' 'hello' 'hello' 'hello' 'hello' '' '' '']
 ['' '' '' '' 'hello' '' 'hello' 'hello' 'hello' '']
 ['' '' 'hello' '' '' '' 'hello' 'hello' '' 'hello']]

Verifying a["x"]

>>> print a["x"]
[[  2  -4 -10  -3  -4   4   3  -8 -10   2]
 [  5  -9  -4  -1   9 -10   3   0  -8   2]
 [  5  -4 -10 -10  -1  -8  -1   0   8  -4]
 [ -7  -3  -2   4   6   6  -8   3  -8   8]
 [  1   2   2  -6   2  -9   3   6   6  -6]
 [ -6   2  -8  -8   4   5   8   7  -5  -3]
 [ -5  -1  -1   9   5  -7   2  -2  -9   3]
 [  3 -10   7  -8  -4  -2  -4   8   5   0]
 [  5   6   5   8  -8   5 -10  -6  -2   1]
 [  9   4  -8   6   2   4 -10  -1   9  -6]]
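The same check can be written as a script rather than read off the printed arrays. This is only a sketch under a few assumptions that are not part of the session above: Python 3, a recent NumPy, a Unicode field (U10) instead of a10, and a fixed random seed chosen purely for repeatability. The names rng, mask and x_before are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability

# Sized Unicode field: up to 10 characters per element.
a = np.zeros((10, 10), dtype=[("x", np.int64), ("y", "U10")])
a["x"] = rng.integers(-10, 10, size=(10, 10))

x_before = a["x"].copy()  # snapshot to compare against afterwards
mask = a["x"] < 0         # boolean mask of the negative entries

# a["y"] is a view into the structured array, so this writes into a itself.
a["y"][mask] = "hello"

x_unchanged = np.array_equal(a["x"], x_before)
labels_ok = bool(np.all(a["y"][mask] == "hello") and np.all(a["y"][~mask] == ""))

print(x_unchanged, labels_ok)
```

If you prefer to set the whole field in one expression, a["y"] = numpy.where(a["x"] < 0, "hello", "") should give the same labels, again assuming the field is wide enough to hold them.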
Raghav RV
  • Ahh, the comment about data type is super helpful! Thanks! As for the mutation of `a["x"]`, perhaps that's related? The values all changed to values within the ASCII values of lowercase letters, so maybe some kind of weird overflow? I'll try changing the type and seeing what happens. – Matthew Aug 09 '14 at 14:59
  • Yep, this seems to have resolved my problem! Thanks! – Matthew Aug 09 '14 at 15:00