I am making a program that translates specific Japanese characters to their English spelling, reading from an external text file and using the replace() function, but I am running into a strange error.
My approach is to read the whole file as bytes into a variable, do the replacements at the byte level on that variable, then decode the result back into a string and write it to a new text file.
path = input('Location: ').strip('"')
txt = ''
with open(path, 'rb') as f:
    txt = f.read()

def convert(jchar, echar):
    ct = txt.replace(jchar.encode('utf-8'), echar.encode('utf-8'))
    return ct

txt = convert('ぁ', 'a')
txt = convert('っ', 'su')

with open('Translated.txt', 'w') as tf:
    tf.write(txt.decode('utf-8'))

input('Done.')
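To isolate the replace step, here is a minimal sketch of the byte-level replacement described above, run on an assumed in-memory sample string instead of a file. The replace and the final decode work fine on their own, even when an unreplaced Japanese character is still present:

```python
# Sketch of the byte-level replace step (sample string is an assumption,
# not taken from the real input file).
data = 'ぁの'.encode('utf-8')                     # UTF-8 bytes of the text
data = data.replace('ぁ'.encode('utf-8'), b'a')  # replace at the byte level
result = data.decode('utf-8')                    # decodes back cleanly
print(result)  # aの
```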
If the text file contains only Japanese characters that the script can replace, everything works perfectly, but if it contains a Japanese character that the script does not replace, I get this error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u306e' in position 6: character maps to <undefined>
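For reference, the same exception can be reproduced by encoding that character with a legacy single-byte codec. This sketch assumes the default locale encoding is cp1252 (common on Windows), which is what open() falls back to when no encoding= argument is given:

```python
# Sketch: 'の' (U+306E) has no mapping in cp1252, the assumed default
# codec used by open('Translated.txt', 'w') on many Windows systems.
text = '\u306e'  # 'の'

try:
    text.encode('cp1252')  # a charmap-based codec, like in the traceback
except UnicodeEncodeError as e:
    print(type(e).__name__)  # UnicodeEncodeError
```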
So Python seems unable to decode a Japanese character's bytes again after encoding it.
Worse, there are some other characters that trigger the same error even when I add replacements for them to the script, which suggests Python cannot even encode them. But my main question is why Python refuses to decode the bytes of a Japanese character when Python itself was able to encode it in the first place.