4

I got this data back from an API: b'\\u041a\\u0435\\u0439\\u0442\\u043b\\u0438\\u043d\\u043f\\u0440\\u043e'. I know for sure that the data is in Russian. I am guessing these values are Unicode escape sequences for the Cyrillic letters?

The data returned was a byte array.

How can I convert that into a readable Cyrillic string? Basically, I need a way to turn this kind of data into human-readable text.

EDIT: Yes this is JSON data. Forgot to mention, sorry.

Govind Parmar
user1757703

1 Answer

5

Chances are you have JSON data; JSON uses \uhhhh escape sequences to represent Unicode codepoints. Use the json.loads() function on unicode (decoded) data to produce a Python string:

import json

string = json.loads(data.decode('utf8'))

UTF-8 is the default JSON encoding; check your response headers (if you are using an HTTP-based API) to see if a different encoding was used.

Demo:

>>> import json
>>> json.loads(b'"\\u041a\\u0435\\u0439\\u0442\\u043b\\u0438\\u043d\\u043f\\u0440\\u043e"'.decode('utf8'))
'Кейтлинпро'
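
If the response headers do name a different encoding, the same approach still applies once you decode with that charset. Here is a minimal sketch, assuming you have the raw body as bytes and the Content-Type header as a string; decode_json_body is a hypothetical helper name, not part of any library, and it falls back to UTF-8 when no charset is given:

```python
import json
from email.message import EmailMessage


def decode_json_body(body: bytes, content_type: str = "application/json"):
    """Decode a JSON response body using the charset from a Content-Type header.

    Hypothetical helper for illustration; falls back to UTF-8 when the
    header carries no charset parameter.
    """
    # Let the standard library parse the header parameters for us.
    msg = EmailMessage()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or "utf-8"
    # json.loads turns \uhhhh escapes into the actual characters.
    return json.loads(body.decode(charset))


# Body with \uhhhh escapes and no charset in the header (UTF-8 assumed).
print(decode_json_body(b'"\\u041a\\u0435\\u0439\\u0442\\u043b\\u0438\\u043d\\u043f\\u0440\\u043e"'))
```

On Python 3.6 and later, json.loads() also accepts bytes directly when the data is UTF-8, UTF-16 or UTF-32, so the explicit decode step is only needed for other encodings or older versions.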
Martijn Pieters
  • Ahh, wonderful. I understand. I was getting a little freaked out, thinking there was some unique way to handle non-ASCII chars. – user1757703 May 27 '14 at 18:12