What happened to numpy.chararray

Question

#Input: 
import numpy as np
chararr = np.chararray((3, 5))
chararr[:] = 'a'
chararr

#Output: 
chararray([[b'a', b'a', b'a', b'a', b'a'],
       [b'a', b'a', b'a', b'a', b'a'],
       [b'a', b'a', b'a', b'a', b'a']], 
      dtype='|S1')

My question is: where does that 'b' come from? I get this output in both Jupyter Notebook and PyCharm.

Abhishek Kashyap · Answer 1 · 2016-07-27T20:31:49.410

3

The b in front of the string shows that it is a bytes literal. Such values are instances of the bytes type instead of the str type, and the literal itself may only contain ASCII characters (other byte values have to be written as escapes such as \xff).

str literals are sequences of Unicode code points; how they are stored internally is an implementation detail.

bytes literals are sequences of octets (integers from 0 to 255).

Don't worry, the b is not part of the actual string. Note that it sits outside the quotes.

For details, see the official Python documentation.
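To make the distinction concrete, here is a minimal sketch in plain Python 3. The variable names are illustrative and not from the original post.

```python
# Illustrative names only; nothing here comes from the original question.
text = "a"    # str: a sequence of Unicode code points
raw = b"a"    # bytes: a sequence of octets

print(repr(raw))          # b'a' (the b prefix belongs to the repr, not to the data)
print(len(raw), raw[0])   # 1 97 (one octet whose integer value is 97)

decoded = raw.decode("ascii")   # turn the bytes into a str
print(decoded == text)    # True
print(raw == text)        # False: bytes and str never compare equal in Python 3
```

Decoding is what you would reach for if you need a real str from such a value.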

edited Jul 27 '16 at 20:31

answered Jul 27 '16 at 20:16


Thanks, that helps. – Maxim Shen Jul 28 '16 at 15:52
@MaximShen If you feel so, can you make this answer accepted? – Abhishek Kashyap Jan 22 '19 at 15:15

score 3 · Accepted Answer · answered Jul 27 '16 at 20:28

In Python 3, the default string type is Unicode, and bytestrings are displayed with the b prefix. Notice the |S1 dtype in your output? S means bytes, while U (as in <U1) means Unicode; that is true for both Py2 and Py3.

chararray has a unicode parameter.

In [161]: A=np.chararray((3,5),unicode=True)
In [162]: A[:]='a'
In [163]: A
Out[163]: 
chararray([['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a']], 
      dtype='<U1')

If I did the same in Py2, I'd be seeing u'a'.
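As a rough sketch of the same S versus U distinction, the snippet below uses plain ndarrays built with np.full instead of chararray. That substitution is an assumption made for illustration, as are the variable names. It also shows one way to convert a bytes array to Unicode with np.char.decode.

```python
import numpy as np

# Assumption: ordinary ndarrays stand in for chararray here.
byte_arr = np.full((3, 5), b"a", dtype="S1")   # bytes dtype, elements show as b'a'
uni_arr = np.full((3, 5), "a", dtype="U1")     # Unicode dtype, elements show as 'a'

print(byte_arr.dtype.kind)   # S
print(uni_arr.dtype.kind)    # U

# Decode the bytes elementwise to get a Unicode array.
decoded = np.char.decode(byte_arr, "ascii")
print(decoded.dtype.kind)    # U
print(bool((decoded == uni_arr).all()))   # True
```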
