-2

I'm retrieving usernames from an API and I'm getting values like:

\u00e2\u0098\u0085Random Name\u00e2\u0098\u0085 <3

When I try to print it out, I end up getting:

★Random Name★ <3

But it's supposed to be:

★Random Name★ <3

\u00e2\u0098\u0085 seems to stand for ★, and it looks like a run of Unicode escape sequences, but clearly something is going wrong in the conversion.
I need some help with how to partially unescape the string.

Edit: (Additional details)
I'm trying to create a Discord bot that regularly updates roles based on retrieved player information.

The first value above is exactly what I get from the API, and I can't do anything to change that.

★Random Name★ <3 is the message being posted in the server.
I'm certain Discord supports the character set because I can paste the username directly without issues.

`data = json.loads(response.text)` is used to parse and store the response.
`await message.channel.send(f"Name: {data['name']}")` is used to send the message.

`send()` is the discord.py method that sends a message to a specific channel.

pauljk
  • 2
    You're looking at bytes that have been decoded from `latin-1` that should have been `utf-8`. Without seeing the code that generated it it's hard to know how to fix. – Mark Ransom Apr 29 '21 at 17:57
  • 1
    Can you show us the code you use to "try to convert it", so we can talk about how to modify that code with more specificity? – Charles Duffy Apr 29 '21 at 17:58
  • 2
    Something (either the API, or the interface to it) is broken. E2 98 85 is the UTF-8 encoding of Unicode code point U+2605. You shouldn't be getting those encoding-specific bytes in a `str` value. – chepner Apr 29 '21 at 17:58
  • 2
    There might actually be an encoding issue in your post itself. I just tried fixing the formatting, but the [revision history](/posts/67322645/revisions) says I changed the data too, which is incorrect, but I also can't fix it no matter what I try. To get around that, please provide a [mre] including your code and actual input data (cause we need to know the format: bytes, str, etc). You can [edit] the question. BTW, welcome to SO! Check out the [tour], and [ask] if you want tips. – wjandrea Apr 29 '21 at 18:00
  • The reason I'm asking for the actual input data is because **that's not a string**. Strings are contained in quotes. If you paste that into a Python console, you get `SyntaxError: unexpected character after line continuation character`. Try doing `print(repr(x))` where `x` is the input data. – wjandrea Apr 30 '21 at 00:48
  • When you say "when I try to convert it", what exactly are you doing? I'm sure the problem is there, but you haven't shown us that code yet! – Mark Ransom Apr 30 '21 at 02:10
  • Knowing that the string comes from JSON helps. See if [json.dumps \u escaped unicode to utf8](https://stackoverflow.com/q/38620471/5987) helps. – Mark Ransom Apr 30 '21 at 02:17
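The diagnosis in the comments can be checked directly in a Python shell. The following is a minimal sketch that uses only the values quoted in the question; the variable names are illustrative. It shows that the three code points U+00E2, U+0098, U+0085 are exactly what you get when the UTF-8 bytes of U+2605 are read one byte per character.

```python
star = "\u2605"  # the character the username is supposed to contain

# UTF-8 encodes U+2605 as three bytes.
utf8_bytes = star.encode("utf-8")
print(utf8_bytes.hex())  # e29885

# Decoding those bytes as Latin-1 maps each byte to one code point,
# which reproduces the sequence seen in the API value.
mojibake = utf8_bytes.decode("latin-1")
print([f"U+{ord(ch):04X}" for ch in mojibake])  # ['U+00E2', 'U+0098', 'U+0085']

# Reversing the two steps recovers the original character.
repaired = mojibake.encode("latin-1").decode("utf-8")
print(repaired == star)  # True
```

Latin-1 is convenient here because it maps every byte value 0-255 to the code point with the same number, so the round trip loses nothing.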

2 Answers

0
print('\u00e2\u0098\u0085Random Name\u00e2\u0098\u0085 <3'.encode('latin-1').decode('utf-8'))

result:

★Random Name★ <3
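If some usernames arrive correctly and others arrive garbled, applying the round trip blindly can raise an exception. Below is a hedged sketch of a small wrapper; `fix_mojibake` is a hypothetical helper name, not part of any library, and it assumes the damage is always UTF-8 read as Latin-1.

```python
def fix_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-Latin-1 mix-up; return the text unchanged if that does not apply."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # UnicodeEncodeError: the text has characters above U+00FF, so it was not mis-decoded this way.
        # UnicodeDecodeError: the bytes are not valid UTF-8, so the text was probably fine already.
        return text


fixed = fix_mojibake("\u00e2\u0098\u0085Random Name\u00e2\u0098\u0085 <3")
already_ok = fix_mojibake("\u2605Already fine\u2605")
plain_latin = fix_mojibake("Jos\u00e9")

print(fixed)        # ★Random Name★ <3
print(already_ok)   # ★Already fine★
print(plain_latin)  # José
```

This is a heuristic: a genuine Latin-1 string that happens to form valid UTF-8 would also be rewritten. If the garbling came from Windows-1252 rather than Latin-1, the codec name in the `encode` call would need to change accordingly.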
Willian Vieira
-1

We can't give a specific answer without seeing the code, but the text has to be encoded back to bytes and then decoded as UTF-8, like this:

username = data['name']  # value parsed from the API response
username = username.encode('latin-1').decode('utf-8')
print(username)

Your code might be different, but since we don't have access to it, this is the best we can give.
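Since the question parses the response with `json.loads`, here is a rough end-to-end sketch of where the repair could go. The `response_text` payload is an assumption shaped like the value quoted in the question, not the real API response, and the Discord `send()` call is left out so the snippet runs on its own.

```python
import json

# Hypothetical response body; the raw string keeps the \u escapes literal, as they would appear in JSON text.
response_text = r'{"name": "\u00e2\u0098\u0085Random Name\u00e2\u0098\u0085 <3"}'

data = json.loads(response_text)
raw_name = data["name"]  # a str made of the code points U+00E2, U+0098, U+0085, ...

# Turn each code point back into one byte, then decode those bytes as UTF-8.
name = raw_name.encode("latin-1").decode("utf-8")

message = f"Name: {name}"
print(message)  # Name: ★Random Name★ <3
```

Note that calling `.encode(...)` on its own returns a `bytes` object; it is the following `.decode("utf-8")` that yields the `str` to put in the message.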

Quessts