5

I am trying to send an email with Chinese characters in the subject line from my program to a Gmail account, but the subject line shows up as ????. This is how the subject line is encoded:

=?utf-8?B?Rlc6IOiri+W5q+aIkee1piDoiIfkvaDotbfkvobnmoTlkIzkuos=?=

Is there anything wrong with the encoding? Is there anything else I need to bear in mind? The mail also contains Chinese characters in the body, but those are displayed just fine. I am using base64 to encode the body.

Gilles 'SO- stop being evil'
Lakshmie

2 Answers

2

For anyone interested in the answer to this question: this string is a MIME encoded-word header, as defined in RFC 2047. In =?utf-8?B?Rlc6IOiri+W5q+aIkee1piDoiIfkvaDotbfkvobnmoTlkIzkuos=?=, utf-8 is the charset, B means the text is Base64-encoded, and everything between the last single ? and the closing ?= is the encoded text.
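To make the three parts concrete, here is a minimal Python sketch that splits the encoded word by hand. The regular expression and the variable names are my own assumptions for illustration, and it only handles the B (Base64) case, not Q:

```python
import base64
import re

# RFC 2047 encoded-word layout: =?charset?encoding?encoded-text?=
# (illustrative pattern, not a full RFC 2047 parser)
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([BbQq])\?([^?]*)\?=")

header = "=?utf-8?B?Rlc6IOiri+W5q+aIkee1piDoiIfkvaDotbfkvobnmoTlkIzkuos=?="
match = ENCODED_WORD.fullmatch(header)
charset, encoding, payload = match.groups()

# This sketch only covers "B" (Base64); "Q" (quoted-printable) is left out.
if encoding.upper() != "B":
    raise ValueError("only Base64 encoded-words are handled here")

# Base64-decode to bytes, then interpret the bytes in the declared charset.
subject = base64.b64decode(payload).decode(charset)
print(charset, encoding, subject)
```

For real mail you would normally let a library do this, since a header can contain several encoded words mixed with plain text.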

In PHP, use iconv_mime_decode.

Gilles 'SO- stop being evil'
Allen Hamilton
2

=?utf-8?B?Rlc6IOiri+W5q+aIkee1piDoiIfkvaDotbfkvobnmoTlkIzkuos=?= is Base64-encoded, and the bytes you get after Base64-decoding it are UTF-8-encoded text.

You can decode it in Python:

>>> from base64 import b64decode
>>> b64decode(b'Rlc6IOiri+W5q+aIkee1piDoiIfkvaDotbfkvobnmoTlkIzkuos=').decode('utf-8')
'FW: 請幫我給 與你起來的同事'

Also in Python, using the standard email.header module:

>>> from email.header import decode_header
>>> decode_header('=?utf-8?B?Rlc6IOiri+W5q+aIkee1piDoiIfkvaDotbfkvobnmoTlkIzkuos=?=')
[(b'FW: \xe8\xab\x8b\xe5\xb9\xab\xe6\x88\x91\xe7\xb5\xa6 \xe8\x88\x87\xe4\xbd\xa0\xe8\xb5\xb7\xe4\xbe\x86\xe7\x9a\x84\xe5\x90\x8c\xe4\xba\x8b', 'utf-8')]
>>> _[0][0].decode(_[0][1])
'FW: 請幫我給 與你起來的同事'

Or in bash (you may need to pipe the output through iconv if your terminal is not using UTF-8):

~ $ echo Rlc6IOiri+W5q+aIkee1piDoiIfkvaDotbfkvobnmoTlkIzkuos= | base64 -d
FW: 請幫我給 與你起來的同事
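Going the other way, this is a rough sketch of how such a subject line could be produced and checked in Python with the standard email.header module. It assumes your program can hand the subject over as a Unicode string; the variable names are just for illustration, and the exact encoded form Python emits (letter case of the B, line folding) may differ from the header in the question:

```python
from email.header import Header, decode_header, make_header

subject = "FW: 請幫我給 與你起來的同事"

# Build an RFC 2047 encoded header value from the Unicode subject.
encoded = Header(subject, "utf-8").encode()
print(encoded)

# Round-trip: decode the encoded value back into a Unicode string.
round_trip = str(make_header(decode_header(encoded)))
print(round_trip)
```

If the round trip gives back the original text, the header value itself is well formed, and the problem is more likely somewhere else in how the message is built or sent.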
kev