Questions tagged [python-unicode]

Python distinguishes between byte strings and unicode strings. *Decoding* transforms byte strings to unicode; *encoding* transforms unicode strings to bytes.

Remember: you decode your input to unicode, work with unicode, then encode unicode objects for output as bytes.
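
The decode, work, encode pattern can be sketched roughly as follows. This is a minimal illustration in Python 3, assuming the incoming bytes are UTF-8; the variable names are hypothetical and the real encoding depends on where your data comes from.

```python
# Bytes as they might arrive from a file, socket, or database.
# Assumption: the source encodes its text as UTF-8.
incoming = b"d\xe2\x80\x99int"

# 1. Decode at the input boundary: bytes -> str (unicode).
text = incoming.decode("utf-8")

# 2. Work with unicode only. Length and case operations now act on
#    characters (code points), not on individual bytes.
char_count = len(text)   # 5 characters, although the input was 7 bytes
shouted = text.upper()

# 3. Encode at the output boundary: str -> bytes.
outgoing = shouted.encode("utf-8")

print(char_count, len(incoming), outgoing)
```

Keeping the conversions at the edges of the program means the code in between never has to guess whether it is holding bytes or text.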

See the

1053 questions
-1
votes
1 answer

Troubles in printing unicode string that I have in byte format

Reading from a database I get the following value: b'd\xe2\x80\x99int'. How can I print it to get the string d’int (note that this is different from d'int)? I tried with print(b'd\xe2\x80\x99int'.decode('utf-8')) but I get the…
Nisba
  • 3,210
  • 2
  • 27
  • 46
-1
votes
1 answer

Python - isalpha() returns True on unicode modifiers

Why does u'\u02c7'.isalpha() return True, if symbol ˇ is not alphabetic? Does this method work properly only with ASCII chars?
Kostya
  • 33
  • 1
  • 7
-1
votes
1 answer

how to memset a unicode string in python 2.7

I have a unicode string f. I want to memset it to 0. print f should display null (\0). I am using ctypes.memset to achieve this: >>> f u'abc' >>> print ("%s" % type(f)) >>> import ctypes >>>…
-1
votes
1 answer

Unicode -- Copyright Symbol

I'm trying to represent the copyright symbol © in Python. If I type © into python interactive terminal I get '\xc2\xa9'. This is 169 and 194 in hexadecimal. But if I look up the copyright symbol in the unicode table it's only 169. Python…
-1
votes
1 answer

pickling with unicode in Python3

I am trying to pickle a dictionary of the form {word : {docId : int}}. My code is below: def vocabProcess(documents): word_splitter = re.compile(r"\w+", re.VERBOSE) stemmer=PorterStemmer()# stop_words = set(stopwords.words('english')) …
Fancypants753
  • 429
  • 1
  • 6
  • 19
-1
votes
2 answers

TypeError: execv() arg 2 must contain only strings (subprocess and unicode)

I have this Python2.7 script which works if LANG != 'C': # -*- coding: utf-8 -*- from __future__ import absolute_import, division, unicode_literals, print_function import os import subprocess import sys print('LANG:…
guettli
  • 25,042
  • 81
  • 346
  • 663
-1
votes
1 answer

How to resolve UnicodeEncodeError storing data in json format

I have scraped data from a website but for some items it shows me the error below: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 4: ordinal not in range(128) I have even put "# -*- coding: utf-8 -*-" at the top of the…
michael
  • 21
  • 8
-1
votes
2 answers

Python 3 UnicodeEncodeError (Apache)

With this code: #!/usr/bin/env python3 open("We’re-introducing-a-DNS-man.jpg", "wb") I get the error: UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 2: ordinal not in range(128) The error only occurs when running the…
Sam Bull
  • 2,559
  • 1
  • 15
  • 17
-1
votes
1 answer

Python: UnicodeDecodeError: 'utf8'

I'm having a problem saving accented letters. I'm using PostgreSQL and Python 2.7. PostgreSQL: ENCODING = 'LATIN1'. I already added these lines but it did not work! #!/usr/bin/python # -*- coding: UTF-8 -*- More about error…
Shinomoto Asakura
  • 1,473
  • 7
  • 25
  • 45
-1
votes
1 answer

Python 2.7 + Flask TypeError: 'unicode' object is not callable

I'm trying to concatenate 2 unicode strings but I get an error. Code: @app.route('/', methods = ['GET','POST']) def index(): form = forms.MyForm() rtv = [] text = u'' if request.method == 'POST': lat = form.latitude.data …
J. Raji
  • 143
  • 4
  • 14
-1
votes
1 answer

Python - Making words from characters separated by space

I have a JSON file which I converted to string to remove HTML tags, but the function returns unicode values as shown below: [u'', u'', u'', u'c', u'i', u's', u' ', u'b', u'y', u' ', u'd', u'e', u'l', u'o', u'i', u't', u't', u'e', u''] I want to…
Rishabh Rusia
  • 173
  • 2
  • 4
  • 19
-1
votes
1 answer

self.encoding_errors & I/O operation on closed file

The below problem is temporarily fixed. I figured out that the input file (csv) has special characters (e.g. Aimí©) that resulted in the error. I now manually change the characters (e.g. Aimí© --> Aime). Previous question: I am using unicodecsv…
Tommy N
  • 365
  • 1
  • 4
  • 12
-1
votes
2 answers

Beautiful Soup Returning Unwanted Characters

I'm using Beautiful Soup to scrape pages trying to get the height of certain athletes: req = requests.get(url) soup = BeautifulSoup(req.text, "html.parser") height = soup.find_all("strong") height = height[2].contents print height Unfortunately,…
CGul
  • 147
  • 1
  • 3
  • 11
-1
votes
1 answer

python3 bytes string encoding

I have this code: res = conn.getresponse() data = res.read() doc = xmltodict.parse(data) risultati = doc['result']['data'] mieiris = json.loads(risultati) for k in mieiris['Headword']['Component']: try: print(k['Text']) except…
Nik
  • 107
  • 1
  • 2
  • 10
-1
votes
2 answers

Unicode handling in python 2

>>> cmd="echo ö" >>> type(s1) >>> print s1 echo ö >>> chan.exec_command(cmd) I am getting a string with some unicode characters from an external application. How should I handle this string in my python code properly? I am getting exception…
Apoorv Gupta
  • 399
  • 1
  • 4
  • 16