Questions tagged [python-unicode]

Python distinguishes between byte strings and Unicode strings. *Decoding* transforms byte strings to Unicode; *encoding* transforms Unicode strings to bytes.

Remember: decode your input to Unicode, work with Unicode, then encode Unicode objects to bytes for output.
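As a minimal sketch of that decode, work, encode cycle in Python 3: the sample bytes and the choice of UTF-8 below are assumptions for illustration, and real input may arrive in a different encoding.

```python
# Bytes as they might arrive from a file or socket.
# Assumption: the data is UTF-8 encoded (illustrative sample only).
raw = b"La Pocati\xc3\xa8re"

# 1. Decode the input bytes to a Unicode string at the boundary.
text = raw.decode("utf-8")

# 2. Work with Unicode: string operations act on characters, not bytes.
shouted = text.upper()

# 3. Encode the Unicode string back to bytes only when producing output.
out = shouted.encode("utf-8")

print(len(raw), len(text))  # byte count vs. character count
print(out)
```

The byte string is one element longer than the decoded text because the accented character occupies two bytes in UTF-8, which is why length and slicing are only meaningful on the decoded string.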

1053 questions
-1
votes
1 answer

python 2.7 wand: UnicodeDecodeError: (Error in get_font_metrics)

I am getting this error "UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 17: ordinal not in range(128)" when I try to merge this image "La Pocatière.png". Python 2.7.11 bg_img = Image(filename='C:/Pocatière.png') …
mrzoogle
  • 111
  • 3
  • 14
-1
votes
2 answers

Python 3 UnicodeEncodeError for characters and smileys in Tweets

I'm making a Twitter API, I get tweets about a specific word (right now it's 'flafel'). Everything is fine except this tweet b'And when I\'m thinking about getting the chili sauce on my flafel and the waitress, a Pinay, tells me not to get it cos…
GLHF
  • 3,835
  • 10
  • 38
  • 83
-1
votes
2 answers

UnicodeDecodeError: 'ascii' codec can't decode byte 0x92?

So I am trying to read data off a .txt file and then find the most common 30 words and print them out. However, whenever I'm reading my txt file, I receive the error: "UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 338:…
-1
votes
2 answers

unicode error printing \u2002 using Python 3

I am getting the error that Python can't decode character \u2002 when trying to print a block of text: UnicodeEncodeError: 'charmap' codec can't encode character '\u2002' in position 355: character maps to <undefined> What I don't understand is…
kyrenia
  • 5,431
  • 9
  • 63
  • 93
-1
votes
2 answers

Unicode category for commas and quotation marks

I have this helper function that gets rid of control characters in XML text: def remove_control_characters(s): #Remove control characters in XML text t = "" for ch in s: if unicodedata.category(ch)[0] == "C": t += " " …
SANBI samples
  • 2,058
  • 2
  • 14
  • 20
-1
votes
1 answer

How to convert u'\xd0' to d0 in hex?

I got this simple but difficult problem in Python. For unknown reason with pypyjs, I got my binary buffer as u'\xd0\xcf\x11\xe0\xa1...'. By the look of it, I knew it would be alright if it is a binary stream of 'd0cf 11e0 a1...'. I wondered how do I…
chfw
  • 4,502
  • 2
  • 29
  • 32
-1
votes
2 answers

Unicode error in Python 3

I am trying to set a folder path as follows: folderpath = "C:\\Users\NY1\\Dropbox\\Research ideas\\Final Code\\Poject_name" and I am getting the following error: SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position…
NNsr
  • 1,023
  • 2
  • 11
  • 25
-1
votes
2 answers

how to get dictionary value as same using python?

Solved with your help #!/usr/bin/python # -*- coding: utf-8 -*- message = {'message1':'நாம்','message2':'செய்தி'} a={} for i in message.keys(): if "message" in i: a[i]=message[i] status="success" print a got…
the-run
  • 977
  • 1
  • 10
  • 21
-1
votes
1 answer

Remove Unicode values that have spaces between them

I have a file containing Unicode strings aligned line by line. ജുഗുപ്‌സയോ നീരസമോ പരിഹാസമോ ദ്യോതിപ്പിക്കുന്ന മുഖഭാവം വളവ്‌ വക്രത തിരിവ്‌ കോട്ടം നന്നേ ചെറുപ്രായത്തില്‍ അസാമന്യ ജീവിത വിജയം നേടുന്നയാള്‍ ഇന്റര്‍നെറ്റിലെ പ്രധാനപ്പെട്ട…
user2085779
-1
votes
1 answer

Arabic code point range in python

I have the code below, which Liang Sun implemented: #Created by Liang Sun in 2013 import re import collections import hashlib class Simhash(object): def __init__(self, value): self.f = 64 self.reg = ur'[\w\ufb50-\ufdff]' …
NZrMd
  • 25
  • 11
-1
votes
3 answers

Python doesn't save file with unicode characters

Python doesn't save the file with Hebrew characters. How do I fix this? (Python 2.7) The example image shows a file in the SPE IDE with a first line of heb = ["ד" ,"ג" ,"ב", "א", ...]
-2
votes
1 answer

can't encode character '\u0144' even using encoding=utf-8 in python3

I am trying to read some information from some .txt files. They are all in English and do not contain any other Unicode characters. The problem is that for one specific file it just crashes and does not show the information; the error is Traceback…
jhonny
  • 805
  • 4
  • 10
-2
votes
1 answer

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1: ordinal not in range(128)

I am using Python 3.6 and can't find a solution for this. Can anyone help me solve this issue?
-2
votes
1 answer

make unicode a string stored in a variable and then send it with telepot

Introduction I'm creating a scraper bot with telepot and selenium, and when I get the text data that I need to send with the Telegram bot it is unreadable, because it contains unicode-escape characters (emoji) in a wrong format like: "hi I like this…
Leonardo Scotti
  • 1,069
  • 8
  • 21
-2
votes
1 answer

Python: Are there any libraries with all the unicode characters similar to the string library for ascii characters?

In python, the string library has methods like string.ascii_letters. Is there anything similar for Unicode characters or symbols? I haven't been able to find anything myself. I appreciate any help! Fairly new to this type of thing so apologies if…