I need to write a file in cp932 (an extended form of Shift-JIS), but I get this error:

UnicodeEncodeError: 'cp932' codec can't encode character '\u270c' in position 0: illegal multibyte sequence

    import codecs
    mytext = '\u270c'
    with codecs.open(path,mode='w',encoding='cp932') as f:
        mytext.encode('cp932',"ignore")
        f.write(mytext)
    exit()

I have simplified mytext for this question.

I thought the character would get past the encoding step because of the "ignore" flag.

However, write raises the error.

Is there any way to solve this?

whitebear

2 Answers

0

\ is a functional symbol in cp932, so if you want to encode \ you should use \\.
In your case:

    import codecs
    mytext = '\\u270c'
    with codecs.open(path,mode='w',encoding='cp932') as f:
        mytext.encode('cp932',"ignore")
        f.write(mytext)
    exit()

Lukas
  • `'\u270c'` and `'\\u270c'` are two different things. – lenz Feb 07 '20 at 11:29
  • @lenz I can't understand what you mean ... that code works as his intention. – Lukas Feb 08 '20 at 14:25
  • Encoding isn't the same thing as escaping. The OP asked about encoding, you're talking about escaping. `'\u270c'` is a single-character string, but when you escape the backslash as `'\\u270c'`, you get a string with six characters – check `len('\u270c'), len('\\u270c')`. When you *encode* either of the strings, you get a `bytes` object (not `str` anymore); the length depends on the encoding chosen. – lenz Feb 08 '20 at 15:32
  • Oh, I didn't know that difference... I'm still a beginner. I'm ashamed of myself. thank you lenz. – Lukas Feb 08 '20 at 15:47
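
The difference between escaping and encoding described in the comments above can be checked directly. This is only a minimal sketch; the variable names are made up for illustration.

```python
single = '\u270c'    # one character: U+270C
escaped = '\\u270c'  # six characters: backslash, u, 2, 7, 0, c

print(len(single), len(escaped))

# Encoding turns a str into bytes; the length depends on the encoding.
utf8_bytes = single.encode('utf-8')
print(type(utf8_bytes).__name__, len(utf8_bytes))

# The escaped form is plain ASCII, so cp932 can encode it, but the
# result is the literal text \u270c, not the original character.
cp932_bytes = escaped.encode('cp932')
print(cp932_bytes)
```

So escaping the backslash makes the error go away only because a different, six-character string is being written.
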
0

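In your example, the file f expects Unicode strings to be passed to f.write() and encodes them itself, using the encoding declared in codecs.open. The separate mytext.encode('cp932',"ignore") call has no effect on the write: it returns a new bytes object that is discarded, and mytext itself is unchanged. Also, '\u270c' is not a character supported by CP932, so it cannot be written to a CP932 file as is; the file's default strict error handling raises UnicodeEncodeError.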
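
As a rough illustration of what the encode call in the question does and does not do (the variable name encoded is introduced here only for the example):

```python
mytext = '\u270c'  # U+270C is not representable in cp932

# encode() returns a new bytes object; with errors='ignore' the
# unencodable character is dropped, leaving empty bytes here.
encoded = mytext.encode('cp932', 'ignore')
print(encoded)

# The original string is untouched, so it still holds the character.
print(len(mytext))

try:
    mytext.encode('cp932')  # default strict mode, as used by f.write()
except UnicodeEncodeError as exc:
    print(exc.reason)
```
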

Assuming Python 3, to write a Unicode string text in a particular encoding, use:

    with open('output.txt','w',encoding='cp932') as f:
        f.write(text)

codecs is an older module and isn't needed. In Python 2, io.open is the backported equivalent to Python 3's open and is also supported by Python 3, for portability.
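
If dropping the unsupported characters is acceptable, one possible approach (not part of the answer above) is to pass an error handler to open itself through its errors parameter. The sketch below assumes Python 3; the sample text and the temporary output path are hypothetical and exist only for this illustration.

```python
import os
import tempfile

text = 'peace \u270c sign'  # contains U+270C, which cp932 cannot represent

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'output.txt')

# errors='ignore' silently drops unencodable characters;
# errors='replace' would write '?' in their place instead.
with open(path, 'w', encoding='cp932', errors='ignore') as f:
    f.write(text)

# Read the file back to see what was actually written.
with open(path, encoding='cp932') as f:
    result = f.read()
print(repr(result))
```

Note that this loses data: the character is removed from the output rather than converted, so it only suits cases where that loss is acceptable.
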

Mark Tolonen