Questions tagged [python-unicode]

Python distinguishes between byte strings and Unicode strings. *Decoding* transforms byte strings to Unicode; *encoding* transforms Unicode strings to bytes.

Remember: you decode your input to unicode, work with unicode, then encode unicode objects for output as bytes.
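The decode, work, encode pattern described above can be sketched roughly as follows in Python 3. The sample bytes and the variable names are assumptions made for illustration, and UTF-8 is assumed as the encoding on both sides; real input may use a different one.

```python
# Hypothetical input: UTF-8 bytes as they might arrive from a file or socket.
raw_input_bytes = b"caf\xc3\xa9"

# 1. Decode at the input boundary: bytes -> text (str in Python 3).
text = raw_input_bytes.decode("utf-8")

# 2. Work with text only. Length and case operations now act on characters,
#    not on individual bytes.
shouted = text.upper()

# 3. Encode at the output boundary: text -> bytes.
raw_output_bytes = shouted.encode("utf-8")

print(len(raw_input_bytes), len(text))
print(raw_output_bytes)
```

The byte string is one element longer than the decoded text because "é" occupies two bytes in UTF-8, which is why processing is safer after decoding.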

See the

1053 questions

2 answers

Unicode Encode Error when writing pandas df to csv

I cleaned 400 excel files and read them into python using pandas and appended all the raw data into one big df. Then when I try to export it to a csv: df.to_csv("path",header=True,index=False) I get this error: UnicodeEncodeError: 'ascii' codec…

asked Jul 10 '15 at 02:09

collarblind

2 answers

Google App Engine: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 48: ordinal not in range(128)

I'm working on a small application using Google App Engine which makes use of the Quora RSS feed. There is a form, and based on the input entered by the user, it will output a list of links related to the input. Now, the application works fine for…

python google-app-engine unicode jinja2 python-unicode

asked Jan 03 '14 at 15:23

Manas Chaturvedi

8 answers

String.maketrans for English and Persian numbers

I have a function like this: persian_numbers = '۱۲۳۴۵۶۷۸۹۰' english_numbers = '1234567890' arabic_numbers = '١٢٣٤٥٦٧٨٩٠' english_trans = string.maketrans(english_numbers, persian_numbers) arabic_trans = string.maketrans(arabic_numbers,…

python python-2.7 python-unicode

asked Aug 09 '12 at 07:56

Shahin

5 answers

how to convert Python 2 unicode() function into correct Python 3.x syntax

I enabled the compatibility check in my Python IDE and now I realize that the inherited Python 2.7 code has a lot of calls to unicode() which are not allowed in Python 3.x. I looked at the docs of Python2 and found no hint how to upgrade: I don't…

python python-3.x python-unicode

asked Aug 01 '16 at 10:55

guettli

2 answers

What's "ANSI_X3.4-1968" encoding?

See following output on my system: [STEP 101] # python3 -c 'import sys; print(sys.stdout.encoding)' ANSI_X3.4-1968 [STEP 102] # [STEP 103] #…

python python-3.x encoding character-encoding python-unicode

asked Feb 12 '18 at 09:27

pynexj

1 answer

How to decode and encode Hebrew strings?

I am trying to encode and decode the Hebrew string "שלום". However, after encoding, I get gibberish: >>> word = "שלום" >>> word = word.decode('UTF-8') >>> word u'\u05e9\u05dc\u05d5\u05dd' >>> print word שלום >>> word = word.encode('UTF-8') >>>…

python python-unicode hebrew

asked Apr 24 '15 at 15:02

user1767774

2 answers

python 2.7 lowercase

When I use .lower() in Python 2.7, the string is not converted to lowercase for the letters ŠČŽ. I read the data from a dictionary. I tried using str(tt["code"]).lower() and tt["code"].lower(). Any suggestions?

python python-2.7 unicode lowercase python-unicode

asked Mar 30 '12 at 12:41

Yebach

1 answer

Python-3 and \x Vs \u Vs \U in string encoding and why

Why do we have different byte oriented string representations in Python 3? Won't it be enough to have single representation instead of multiple? For ASCII range number printing a string shows a sequence starting with \x: In [56]: chr(128) Out[56]:…

python python-3.x unicode python-unicode unicode-string

asked Sep 09 '17 at 16:45

MaNKuR

1 answer

Will a UNICODE string just containing ASCII characters always be equal to the ASCII string?

I noticed the following holds: >>> u'abc' == 'abc' True >>> 'abc' == u'abc' True Will this always be true or could it possibly depend on the system locale? (It seems strings are unicode in python 3: e.g. this question, but bytes in 2.x)

python python-2.7 unicode character-encoding python-unicode

asked Feb 20 '15 at 11:03

doctorlove

1 answer

Why does ElementTree reject UTF-16 XML declarations with "encoding incorrect"?

In Python 2.7, when passing a unicode string to ElementTree's fromstring() method that has encoding="UTF-16" in the XML declaration, I'm getting a ParseError saying that the encoding specified is incorrect: >>> from xml.etree import ElementTree >>>…

python-2.7 unicode encoding elementtree python-unicode

asked Jun 04 '14 at 19:25

Henrik Heimbuerger

3 answers

Load Python 2 .npy file in Python 3

I'm trying to load /usr/share/matplotlib/sample_data/goog.npy: datafile = matplotlib.cbook.get_sample_data('goog.npy', asfileobj=False) np.load(datafile) It's fine in Python 2.7, but raises an exception in Python 3.4: UnicodeDecodeError: 'ascii'…

python python-3.x numpy python-unicode

asked Jun 08 '14 at 10:21

Frozen Flame

3 answers

Regex to Match Horizontal White Spaces

I need a regex in Python2 to match only horizontal white spaces not newlines. \s matches all whitespaces including newlines. >>> re.sub(r"\s", "", "line 1.\nline 2\n") 'line1.line2' \h does not work at all. >>> re.sub(r"\h", "", "line 1.\nline…

regex python-2.7 unicode python-unicode

asked Sep 07 '17 at 12:14

Memduh

3 answers

Python3: UnicodeEncodeError: 'ascii' codec can't encode character '\xfc'

I am trying to get a very simple example running on OSX with Python 3.5.1, but I'm really stuck. I have read many articles that deal with similar problems, but I cannot fix this myself. Do you have any hints on how to resolve this issue? I would…

python-3.x iso-8859-1 python-unicode

asked Aug 10 '16 at 20:15

Hans Bondoka

1 answer

Display width of unicode strings in Python

How can I determine the display width of a Unicode string in Python 3.x, and is there a way to use that information to align those strings with str.format()? Motivating example: Printing a table of strings to the console. Some of the strings contain…

python string unicode width python-unicode

asked Mar 06 '14 at 13:05

Christian Aichinger

6 answers

python url unquote followed by unicode decode

I have a unicode string like '%C3%A7%C3%B6asd+fjkls%25asd' and I want to decode this string. I used urllib.unquote_plus(str), but it gives the wrong result. Expected: çöasd+fjkls%asd. Result: Ã§Ã¶asd fjkls%asd. Double-encoded UTF-8 characters (%C3%A7 and %C3%B6)…

url-encoding python-unicode

asked Feb 28 '11 at 07:29

user637287
