I'm trying to decode Japanese strings in a loop that reads a file encoded in Shift-JIS.
It works, but when a string contains circled-number characters like "①", I get the following error:
UnicodeDecodeError: 'shift_jis' codec can't decode bytes in position 24-25: illegal multibyte sequence
Some of the code:
from struct import unpack

def read_short(data):
    # Big-endian signed 16-bit length prefix
    return unpack('>h', data.read(2))[0]

def read_string(data):
    length = read_short(data)
    return unpack(str(length) + 's', data.read(length))[0].decode('shift-jis')

test = read_string(data)
Is there a Japanese codec that can read this kind of character, or do I have to find a way to convert it beforehand?
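For reference, here is a minimal reproduction sketch that does not need my file. It assumes "①" is stored as the byte pair 0x87 0x40, which is how Python's cp932 codec (the Windows variant of Shift-JIS) encodes it; I have not confirmed that my file uses exactly those bytes. The `encoding` parameter and the in-memory `record` are only there for illustration. It also tries cp932 as one candidate codec.

```python
import io
from struct import pack, unpack

def read_string(data, encoding):
    # Same layout as above: big-endian short length, then that many bytes.
    length = unpack('>h', data.read(2))[0]
    return data.read(length).decode(encoding)

# Assumption: "①" is stored as 0x87 0x40 (its cp932 encoding).
payload = b'\x87\x40'
record = pack('>h', len(payload)) + payload

# Strict shift_jis rejects this byte pair, matching the error I see.
try:
    read_string(io.BytesIO(record), 'shift_jis')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# cp932 accepts the same bytes.
decoded = read_string(io.BytesIO(record), 'cp932')
print(strict_ok, decoded)
```

If the file really came from a Windows tool, this suggests swapping the codec name might be enough, but that depends on the assumption above holding for the actual data.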