#Input: 
import numpy as np

chararr = np.chararray((3, 5))
chararr[:] = 'a'
chararr

#Output: 
chararray([[b'a', b'a', b'a', b'a', b'a'],
           [b'a', b'a', b'a', b'a', b'a'],
           [b'a', b'a', b'a', b'a', b'a']], 
          dtype='|S1')

My question is: where does that 'b' come from? I see it in both Jupyter Notebook and PyCharm.

Maxim Shen

2 Answers


The b in front of the string shows that it is a bytes literal. Such values are instances of the bytes type rather than the str type, and a bytes literal may only contain ASCII characters directly (other byte values have to be written as escapes such as \xff).

str literals are sequences of Unicode code points; how they are stored internally is an implementation detail.

bytes literals are sequences of octets (integers from 0 to 255).

Don't worry, the b is not part of the actual string. Note that it sits outside the quotes.

For details, see the official Python documentation.
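To make the distinction concrete, here is a minimal sketch in plain Python 3 (no NumPy needed). The variable names text and raw are only illustrative. It shows that the b appears in the repr of a bytes object, not in its contents:

```python
text = 'a'    # str: a sequence of Unicode code points
raw = b'a'    # bytes: a sequence of octets

print(type(text).__name__)   # str
print(type(raw).__name__)    # bytes

# The b prefix is part of how bytes are displayed, not part of the data.
print(repr(raw))             # b'a'
print(len(raw))              # 1
print(raw[0])                # 97, the integer value of the single octet

# Decoding turns the bytes into an ordinary str.
decoded = raw.decode('ascii')
print(decoded == text)       # True
```

Indexing a bytes object returns an integer, which is another sign that it holds raw octets rather than characters.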

Abhishek Kashyap

In Python 3, the default string type is Unicode. Bytestrings are displayed with the b prefix. Notice the |S1 dtype? The S means bytes, whereas U (as in <U1) means Unicode (that's true for both Py2 and Py3).

chararray has a unicode parameter that selects the Unicode dtype instead:

In [161]: A=np.chararray((3,5),unicode=True)
In [162]: A[:]='a'
In [163]: A
Out[163]: 
chararray([['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a']], 
      dtype='<U1')

If I did the same in Py2, I'd be seeing u'a'.
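If you already have an array of bytes, one way to get rid of the b prefix is to decode it. The sketch below is only an illustration, not part of the session above: it uses a plain ndarray with dtype 'S1' in place of chararray (an assumption made to keep the example short), and the names byte_arr and text_arr are hypothetical.

```python
import numpy as np

# A 3x5 array of single-byte strings, the same dtype kind as in the question.
byte_arr = np.full((3, 5), b'a', dtype='S1')
print(byte_arr.dtype.kind)   # S (bytes)

# np.char.decode applies bytes.decode element-wise and returns a Unicode array.
text_arr = np.char.decode(byte_arr, 'ascii')
print(text_arr.dtype.kind)   # U (Unicode)
print(text_arr[0, 0])        # a
```

Creating the array with unicode=True, as shown earlier, avoids the conversion step altogether.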

hpaulj