I am taking an online course, and the following supposedly demonstrates that "NumPy arrays: contain only one type":

In [19]: np.array([1.0, "is", True])
Out[19]:
array(['1.0', 'is', 'True'],
dtype='<U32')

At first, I thought that the output was a form of error message, but a web search did not confirm this. In fact, I haven't come across an explanation. Can anyone explain how to interpret the output?

Afternote: After reviewing the answers, the dtype page, and the numpy.array() page, it seems that dtype='<U32' would be more accurately described as dtype('<U32'). Is this correct? It seems so to me, but I'm a newbie, and even the numpy.array() page assigns a string to the dtype parameter rather than an actual dtype object.

Also, why does '<U32' specify a 32-character string when all of the elements are much shorter strings?

user36800
  • See [Numpy data type objects](https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html) – Barmar Jul 09 '19 at 03:13
  • The course wants you to see that while the input list contains a float, a string and a boolean, the resulting array just has strings. The float and boolean have been converted to their string counterparts. – hpaulj Jul 09 '19 at 03:30
  • Thanks, Barmar and hpaulj. I actually did not notice that the output was an array of strings until you pointed that out. – user36800 Jul 09 '19 at 04:43

2 Answers

It is fully explained in the manual:

Several kinds of strings can be converted. Recognized strings can be prepended with '>' (big-endian), '<' (little-endian), or '=' (hardware-native, the default), to specify the byte order.

[...]

The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are

[...]

'U'        Unicode string

So '<U32' means a little-endian Unicode string of up to 32 characters.
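As a rough illustration of how those pieces map onto a dtype object, the sketch below parses the string from the question with np.dtype and reads its attributes back. The variable names are only illustrative, and the 4-bytes-per-character figure assumes NumPy's usual UCS-4 storage for 'U' strings.

```python
import numpy as np

dt = np.dtype('<U32')       # the dtype string from the question

kind = dt.kind              # 'U' means Unicode string
n_bytes = dt.itemsize       # bytes per element (4 bytes per character)
n_chars = n_bytes // 4      # characters each element can hold

print(kind, n_bytes, n_chars)

# The array from the question holds strings, not a float and a boolean.
arr = np.array([1.0, "is", True])
print(arr.dtype.kind)       # 'U'
print(arr.tolist())         # ['1.0', 'is', 'True']
```

Reading kind and itemsize avoids having to parse the string by hand, and it works the same way whatever byte order the machine uses.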

Amadan

dtype='<U32' means that each element is a little-endian Unicode string of up to 32 characters.
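The 32 is a capacity rather than the length of any particular element. As a hedged sketch (the variable names are hypothetical, and the remark about floats is my understanding rather than something stated in the documentation quoted here), compare an all-string input with the mixed input from the question:

```python
import numpy as np

# All-string input: the width is the length of the longest element.
strings = np.array(["1.0", "is", "True"])
width_strings = strings.dtype.itemsize // 4   # 4 bytes per character

# Mixed input from the question: NumPy appears to reserve enough room for
# the text form of any float, which is why the question's session shows <U32.
mixed = np.array([1.0, "is", True])
width_mixed = mixed.dtype.itemsize // 4

print(width_strings, width_mixed)

# The width is a fixed capacity: longer strings assigned later are truncated.
short = np.array(["ab"], dtype="U2")
short[0] = "abcdef"
print(short[0])   # ab
```

If the extra width matters, converting the values to strings before building the array keeps the dtype as narrow as the longest element.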

The documentation on dtypes goes into more depth about each of the characters.

'U' Unicode string

Several kinds of strings can be converted. Recognized strings can be prepended with '>' (big-endian), '<' (little-endian), or '=' (hardware-native, the default), to specify the byte order.

Examples:

import numpy as np

dt = np.dtype('f8')   # 64-bit floating-point number
dt = np.dtype('c16')  # 128-bit complex floating-point number
dt = np.dtype('S25')  # 25-length zero-terminated bytes (spelled 'a25' in older NumPy; that alias was removed in NumPy 2.0)
dt = np.dtype('U25')  # 25-character string
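On the afternote in the question: a dtype string and a dtype object can be used interchangeably wherever NumPy expects a dtype. The following is a minimal sketch with illustrative variable names; the repr shown in the last comment assumes a little-endian machine.

```python
import numpy as np

from_string = np.array(["is"], dtype="U25")            # dtype given as a string
from_object = np.array(["is"], dtype=np.dtype("U25"))  # dtype given as an object

print(from_string.dtype == from_object.dtype)   # True
print(from_string.dtype == "U25")               # True: the string is converted for the comparison
print(repr(from_string.dtype))                  # dtype('<U25') on a little-endian machine
```

So dtype='<U32' in the array's printed form is shorthand: the array stores a dtype object, and the string is simply the compact way of writing it.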
adlopez15
  • I really appreciate the examples. Frankly, I don't think I'm at a point where the cited documentation is meaningful. – user36800 Jul 09 '19 at 03:32