Questions tagged [python-unicode]

Python distinguishes between byte strings and unicode strings. *Decoding* transforms byte strings to unicode; *encoding* transforms unicode strings to bytes.

Remember: you decode your input to unicode, work with unicode, then encode unicode objects for output as bytes.
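As a rough illustration of that decode, work, encode cycle, here is a minimal Python 3 sketch. The sample bytes and the variable names are hypothetical, and UTF-8 is only an assumption; real input should be decoded with whatever encoding it was actually written in.

```python
# Hypothetical input: UTF-8 bytes as they might arrive from a file or socket.
raw_input_bytes = b"CO\xe2\x82\x82 caf\xc3\xa9"

# 1. Decode at the input boundary: bytes -> str (assumes the data is UTF-8).
text = raw_input_bytes.decode("utf-8")

# 2. Work with str only inside the program.
shouted = text.upper()

# 3. Encode at the output boundary: str -> bytes.
raw_output_bytes = shouted.encode("utf-8")

# ascii() keeps the output printable even on consoles that are not UTF-8.
print(type(text).__name__, ascii(text))
print(type(raw_output_bytes).__name__, raw_output_bytes)
```

Keeping the conversions at the edges means the rest of the code never has to guess whether a value is bytes or text.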

1053 questions
-1
votes
1 answer

How to handle incoming "b'xxx'" string

I got the info below from another device: foo = { "abc": "b'E3:DE'" } I know the "b" prefix means bytes in Python 3. My intent is to convert it into a string. My Python version treats it as unicode type. I tried many ways, but none work. The prefix "b" is always…
nathan
  • 754
  • 1
  • 10
  • 24
-1
votes
1 answer

Python unicode code point issues: \xe2\x82\x82 vs. CO\u2082

My program is required to take in inputs but I am having issues with subscripts such as CO₂... So when I use CO₂ as an argument into the function, it seems to be represented as a string: 'CO\xe2\x82\x82' which is apparently the string…
Cal
  • 1
-1
votes
2 answers

Error (unicode error) 'utf-8' codec can't decode byte - code = compile(f.read(), fname, 'exec')

I'm new to Python. I'm trying to run this code: llaves=("España","Francia","Inglaterra") dicPaises={llaves[0]:"Madrid",llaves[1]:"Paris",llaves[2]:"Londres"} print(dicPaises) The result is the following error: Traceback (most recent call last): …
-1
votes
1 answer

"`ascii‘ codec can’t enforce character u’\u2022‘ in position 206: ordinal not in range (128)"

I get this error when I try to execute my Python script on my Ubuntu 20.04 home server: 'ascii' codec can't encode character u'\u2022' in position 206: ordinal not in range(128)
philipmrz
  • 31
  • 6
-1
votes
1 answer

Debugging CSV file

I'm dealing with an issue with a CSV file - my code runs perfectly with the old file. But I've recently updated the file to include more websites my script can scrape and now my code is running into an error: UnicodeDecodeError: 'utf-8' codec can't…
Laura092
  • 1
  • 2
-1
votes
2 answers

Cannot open text files in Python 3: Unicode error

I am having a problem opening a simple text file in Python 3.8. I set up a simple test. Here is my test code: import os file_path = "c:\Users\username\Documents\folder1\some_file.txt" with open(file_path, 'r') as f: for line in f: …
D Chase
  • 85
  • 7
-1
votes
1 answer

How to correctly read and encode a text sent via flask?

I wrote a simple API that reads and writes files. A minimal version is below: import flask class Web: def __init__(self): self.app = flask.Flask('web') self.app.add_url_rule('/read/', 'read', self.read) …
WoJ
  • 27,165
  • 48
  • 180
  • 345
-1
votes
1 answer

In Python 3.9, when I insert a unicode beer mug, I get as many grey diamonds as proper pictures

I am writing code to have the user enter the number of bottles of beer on the wall, and the output should be the verses of the song down to 1. I am trying to get the proper number of unicode mugs of beer to print before the verse (5 mugs, then sing…
Heather M
  • 1
  • 1
-1
votes
1 answer

Streaming with Tweepy: converting unicode characters to letters

The tweets I capture when streaming with Tweepy come in Unicode special characters and I need them to be letters. I have found many solutions on the site but none of them seemed to work or even to apply to my case, since I’m collecting tweets in…
yuko
  • 3
  • 3
-1
votes
1 answer

Python UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I am getting this error on ubuntu 18.04, using python 3.6: File "/home/sw/miniconda3/envs/py36/lib/python3.6/codecs.py", line 644, in __next__ line = self.readline() File "/home/sw/miniconda3/envs/py36/lib/python3.6/codecs.py", line 557, in…
Sie Tw
  • 133
  • 1
  • 12
-1
votes
3 answers

Split unicode by character into list

I have made a program that reads a selection of names, which is then turned into Unicode, for example: StevensJohn:-: WasouskiMike:-: TimebombTime:-: etc. Is there any way to make a list that would split the index so it's like example_list = ["StevensJohn",…
-1
votes
2 answers

Replacing non-UTF-8 from a string

Here is the code: s = 'Waitematā' w = open('test.txt','w') w.write(s) w.close() I get the following error: UnicodeEncodeError: 'charmap' codec can't encode character '\u0101' in position 8: character maps to <undefined> The string will print with…
-1
votes
1 answer

Convert string of unicode character into unicode type

I have a list of strings of Gujarati unicode characters and I want to convert them to unicode. But the problem is the escape character ('\'). For example: a="\\u0aec" print(type(a)) # How to convert it into Unicode which is ('\u0aec')? Also…
sparsh goil
  • 33
  • 1
  • 7
-1
votes
1 answer

Convert in utf16

I am crawling several websites and extracting the names of the products. In some names there are errors like this: Malecon 12 Jahre 0,05 ltr. Reserva Superior Bols Watermelon Lik\u00f6r 0,7l Hayman\u00b4s Sloe Gin Ron Zacapa Edici\u00f3n…
CIC3RO
  • 13
  • 4
-1
votes
3 answers

unexpected result while parsing html with bs4

I tried this code in Python 3.8.2: from bs4 import BeautifulSoup import urllib.request html = urllib.request.urlopen( 'https://vietnamnet.vn/').read() soup = BeautifulSoup(html, "html.parser").encode("utf-8") print(soup.title) but I…