24

I am displaying new email with IMAP, and everything looks fine, except for one message subject shows as:

=?utf-8?Q?Subject?=

How can I fix it?

ataylor
janeh

6 Answers

33

In MIME terminology, those encoded chunks are called encoded-words. You can decode them like this:

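import email.header
# decode_header returns a list of (data, charset) pairs; take the first one
text, encoding = email.header.decode_header('=?utf-8?Q?Subject?=')[0]
# On Python 3, `text` is bytes here; call text.decode(encoding) to get a str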

Check out the docs for email.header for more details.

Basj
ataylor
  • In both Python 2 and Python 3, `email.header.decode_header` (with lower-case `m`) is the generic name. In addition, in your code, `text` is not actually a text, but instead a bytes variable. – phihag Apr 19 '16 at 10:49
14

This is a MIME encoded-word. You can parse it with email.header:

import email.header

def decode_mime_words(s):
    return u''.join(
        word.decode(encoding or 'utf8') if isinstance(word, bytes) else word
        for word, encoding in email.header.decode_header(s))

print(decode_mime_words(u'=?utf-8?Q?Subject=c3=a4?=X=?utf-8?Q?=c3=bc?='))
Community
phihag
  • Could you rewrite that in a more Pythonic fashion? – wbg Feb 12 '19 at 18:25
  • @wbg What's not Pythonic about this code? What would you change? Looking at it now, it seems rather well-written to me, and a paragon of Python's expressiveness. Maybe the [generator expression](https://docs.python.org/dev/reference/expressions.html#generator-expressions) is tripping up @deterjan? If you're just targeting Python 3, you can skip the `if isinstance(word, bytes) else word` and the `u` before the `'`; this code has been engineered to work on both Python 2 and 3. – phihag Feb 13 '19 at 08:40
14

The text is encoded as a MIME encoded-word. This is a mechanism defined in RFC 2047 for encoding headers that contain non-ASCII text, so that the encoded output contains only ASCII characters.
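To make the format concrete, here is a minimal sketch that decodes a single encoded-word by hand. An encoded-word has the shape `=?charset?encoding?encoded-text?=`, where the encoding is `Q` (quoted-printable-like) or `B` (base64). The helper name `decode_encoded_word` and the regular expression are illustrative assumptions, not part of the standard library, and the sketch ignores edge cases such as language tags. For real code, prefer `email.header`.

```python
import base64
import quopri
import re

# Shape of an RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r'=\?([^?]+)\?([bBqQ])\?([^?]*)\?=')


def decode_encoded_word(word):
    """Decode one encoded-word by hand (hypothetical helper for illustration)."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        return word  # not an encoded-word, return it unchanged
    charset, encoding, payload = match.groups()
    if encoding.upper() == 'B':
        # "B" means the payload is base64
        raw = base64.b64decode(payload)
    else:
        # "Q" is quoted-printable with "_" standing for a space;
        # header=True tells quopri to apply that underscore rule
        raw = quopri.decodestring(payload.encode('ascii'), header=True)
    return raw.decode(charset)


print(decode_encoded_word('=?utf-8?Q?Pep=C3=A9?='))  # Pepé
print(decode_encoded_word('=?utf-8?B?UGVww6k=?='))   # Pepé
```

Both calls carry the same UTF-8 bytes, once in each of the two encodings.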

In Python 3.3+, the parsing classes and functions in the email package automatically decode encoded-words in headers if their policy argument is set to policy.default:

>>> import email
>>> from email import policy

>>> msg = email.message_from_file(open('message.txt'), policy=policy.default)
>>> msg['from']
'Pepé Le Pew <pepe@example.com>'

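The parsing classes and functions that accept a policy argument include:

  • email.parser.BytesParser
  • email.parser.Parser
  • email.message_from_bytes
  • email.message_from_binary_file
  • email.message_from_string
  • email.message_from_file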

Confusingly, up to at least Python 3.10, the default policy for these parsing functions is not policy.default, but policy.compat32, which does not decode "encoded words".

>>> msg = email.message_from_file(open('message.txt'))
>>> msg['from']
'=?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>'
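The difference between the two policies can be reproduced without a file on disk. The following is a small sketch that parses the same raw message twice with `email.message_from_string`. The `RAW` message text is made up for illustration.

```python
import email
from email import policy

# A made-up message whose headers contain RFC 2047 encoded-words
RAW = (
    'From: =?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>\n'
    'Subject: =?utf-8?Q?Subject?=\n'
    '\n'
    'Body\n'
)

# policy.default decodes encoded-words when a header is accessed
decoded = email.message_from_string(RAW, policy=policy.default)

# With no policy argument, policy.compat32 is used and headers stay raw
legacy = email.message_from_string(RAW)

print(decoded['subject'])  # Subject
print(legacy['subject'])   # =?utf-8?Q?Subject?=
```

When fetching with imaplib you usually have bytes rather than a string, in which case `email.message_from_bytes` takes the same `policy` argument.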
snakecharmerb
6

Try Imbox

imaplib is a very low-level library, and the results it returns are hard to work with.

Installation

pip install imbox

Usage

from imbox import Imbox

with Imbox('imap.gmail.com',
           username='username',
           password='password',
           ssl=True,
           ssl_context=None,
           starttls=False) as imbox:

    # messages() yields (uid, message) pairs for the inbox
    all_inbox_messages = imbox.messages()
    for uid, message in all_inbox_messages:
        # subject is already decoded to readable text
        print(message.subject)
Artem Bernatskyi
  • +1 truly this is for humans. Indeed imbox was able to decode otherwise base64-encoded (in imaplib and the like) subject and other fields on-the-fly. However, be aware if some field is missing the KeyError will be thrown. – Anatoly Alekseev Oct 09 '18 at 14:36
3

In Python 3, decoding this to an approximated string is as easy as:

from email.header import decode_header, make_header

decoded = str(make_header(decode_header("=?utf-8?Q?Subject?=")))

See the documentation of decode_header and make_header.
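A header can contain several encoded-words, possibly in different encodings, and `make_header` joins them for you. Here is a short sketch of that case. The function name `decode_header_value` and the sample header are assumptions made for illustration.

```python
from email.header import decode_header, make_header


def decode_header_value(value):
    """Return a header value with all encoded-words decoded (hypothetical helper)."""
    # decode_header splits the value into (data, charset) chunks;
    # make_header reassembles them into a Header, and str() renders the text
    return str(make_header(decode_header(value)))


# One Q-encoded word followed by one base64-encoded word; the whitespace
# between two adjacent encoded-words is not part of the decoded text
sample = '=?utf-8?Q?Pep=C3=A9?= =?utf-8?B?IExlIFBldw==?='
print(decode_header_value(sample))  # Pepé Le Pew
```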

Tzach
0

A high-level IMAP library may be useful here: imap_tools

from imap_tools import MailBox

# get list of email subjects from INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'pwd', 'INBOX') as mailbox:
    subjects = [msg.subject for msg in mailbox.fetch()]
  • Parsed email message attributes
  • Query builder for searching emails
  • Actions with emails: copy, delete, flag, move, seen
  • Actions with folders: list, set, get, create, exists, rename, delete, status
  • No dependencies
Vladimir