
The UTF-8 representation of the Japanese character 'あ' is the three-byte sequence E3 81 82. I have it in a Jython list like this:

>>> [0xE3, 0x81, 0x82]
[227, 129, 130]

Can I convert this list of UTF-8 bytes to a Jython unicode string? I want to output 'あ' by printing the unicode string, like the following:

str = convert_utf8_list_to_unicode([0xE3, 0x81, 0x82])
print str # => あ

Environment

  • OS: Mac OS X 10.9.3 Mavericks
  • Jython: 2.5.3
  • Java: 1.6.0_65
wataradio

1 Answer


Try this:

a = [0xE3, 0x81, 0x82]
print "".join([chr(c) for c in a]).decode('UTF-8')

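This works in regular Python 2 (CPython) for me: chr() turns each integer into a one-byte str, join() concatenates them into a byte string, and decode('UTF-8') turns that into a unicode string. I don't know whether it behaves differently in Jython.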
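If you want the helper named in the question, here is a minimal sketch. It assumes every list element is an integer in the range 0-255, and it swaps the chr()/join() step for struct.pack so that the same code also runs on Python 3. The function name comes from the question; I have not tested this on Jython 2.5.3 itself.

```python
import struct


def convert_utf8_list_to_unicode(byte_values):
    """Decode a list of ints (each 0-255) holding UTF-8 bytes into a unicode string."""
    # Pack the integers into a byte string, one unsigned byte ("B") per element.
    # struct.pack raises struct.error if a value does not fit in one byte.
    raw = struct.pack("%dB" % len(byte_values), *byte_values)
    # Decode the UTF-8 bytes; invalid sequences raise UnicodeDecodeError.
    return raw.decode("utf-8")


text = convert_utf8_list_to_unicode([0xE3, 0x81, 0x82])
```

Here text is the one-character unicode string u'\u3042' ('あ'). Whether print text shows it correctly depends on the console encoding, so that part may need adjusting in your environment.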

user1524220