I'm making a config file that maps each emoji's Unicode code point to its SoftBank Unicode code point. I'm using a Python program to scrape this information from http://punchdrunker.github.com/iOSEmoji/table_html/ios6/index.html

But there is a problem: the SoftBank code on the web page is given as UTF-8 hex bytes, not as a Unicode code point. How can I convert it to a Unicode code point?

For example, I want to change EE9095 to E415 (the first emoji).

I tried to do it like this:

code.decode('utf-8')

But it didn't work: the value of code stayed the same. The Unix command iconv didn't work either.

user1462782

2 Answers

Are you sure code is actually encoded in UTF-8? This works for me:

>>> b'\xee\x90\x95'.decode('utf-8')
u'\ue415'
kennytm
  • I scraped code from the web page. Is it already encoded? I do it like this: `code='ee9095'`, then `code.decode('utf-8')` – user1462782 Nov 13 '12 at 07:48
  • @user1462782: `'ee9095'` will be a string of 6 bytes ('e', 'e', '9', '0', '9', '5'). This is different from `'\xee\x90\x95'` which is a string of 3 bytes (0xee, 0x90, 0x95). You need to convert the hex string `'ee9095'` to an actual byte sequence using e.g. `bytearray.fromhex('ee9095').decode('utf-8')` – kennytm Nov 13 '12 at 08:01
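
The point made in the comment above can be sketched in Python 3 as follows. This is only an illustration: the helper name `utf8_hex_to_codepoints` is hypothetical, and it assumes the scraped value is a plain string of hex digits with no spaces or prefixes.

```python
def utf8_hex_to_codepoints(hex_str):
    """Turn a UTF-8 hex dump such as 'EE9095' into code point strings such as ['E415'].

    Hypothetical helper; assumes hex_str contains only hex digits.
    """
    raw = bytes.fromhex(hex_str)   # 6 hex digits -> 3 raw bytes
    text = raw.decode('utf-8')     # 3 UTF-8 bytes -> 1 character
    # Format each character's code point as uppercase hex, at least 4 digits wide.
    return ['%04X' % ord(ch) for ch in text]


print(len('ee9095'))                     # 6 (characters, not bytes of the emoji)
print(len(bytes.fromhex('ee9095')))      # 3 (the actual UTF-8 bytes)
print(utf8_hex_to_codepoints('EE9095'))  # ['E415']
```

The function returns a list because a scraped value could, in principle, decode to more than one character.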

How about this:

>>> 'EE9095'.decode('hex').decode('utf-8')
<<< u'\ue415'
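
Note that `str.decode('hex')` exists only in Python 2. A rough Python 3 equivalent, assuming the same input string, could use `binascii.unhexlify`; the variable names below are illustrative, and the round trip at the end is just a sanity check that the code point encodes back to the scraped bytes.

```python
import binascii

softbank_utf8_hex = 'EE9095'                  # value as scraped (assumed format)

raw = binascii.unhexlify(softbank_utf8_hex)   # hex text -> raw bytes
char = raw.decode('utf-8')                    # raw UTF-8 bytes -> one character
codepoint = format(ord(char), 'X')            # code point as uppercase hex text

# Sanity check: encode the character again and compare with the input.
round_trip = binascii.hexlify(char.encode('utf-8')).decode('ascii').upper()

print(codepoint)    # E415
print(round_trip)   # EE9095
```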
evandrix
plaes