
I am trying to read an email from my Gmail inbox in Python 3, so I followed this tutorial: https://www.thepythoncode.com/article/reading-emails-in-python

My code is the following:

username = "*****@gmail.com"
password = "******"
# create an IMAP4 class with SSL 
imap = imaplib.IMAP4_SSL("imap.gmail.com")
# authenticate
imap.login(username, password)
status, messages = imap.select("INBOX")
# total number of emails
messages = int(messages[0])
    

for i in range(messages, 0, -1):
    # fetch the email message by ID
    res, msg = imap.fetch(str(i), "(RFC822)")
    for response in msg:
        if isinstance(response, tuple):
            # parse a bytes email into a message object
            msg = email.message_from_bytes(response[1])
            # decode the email subject
            subject = decode_header(msg["Subject"])[0][0]
            if isinstance(subject, bytes):
                # if it's a bytes, decode to str
                subject = subject.decode()
            # email sender
            from_ = msg.get("From")
            # if the email message is multipart
            if msg.is_multipart():
                # iterate over email parts
                for part in msg.walk():
                    # extract content type of email
                    content_type = part.get_content_type()
                    content_disposition = str(part.get("Content-Disposition"))

                    # get the email body
                    body = part.get_payload(decode=True).decode()
                    print(str(body))

    imap.close()
    imap.logout()
    print('DONE READING EMAIL')

The libraries I am using are:

import imaplib
import email
from email.header import decode_header

However, when I execute it I get the following error message, which I don't understand because I never have an empty email in my inbox:

Traceback (most recent call last):

  File "<ipython-input-19-69bcfd2188c6>", line 38, in <module>
    body = part.get_payload(decode=True).decode()

AttributeError: 'NoneType' object has no attribute 'decode'

Does anyone have an idea what my problem could be?

lolaa
  • `part.get_payload(decode=True)` returns `None`. – Jan Jul 20 '20 at 06:37
  • I know, but I do not understand why, because all my emails have content in both the subject and the body – lolaa Jul 20 '20 at 06:40
  • Add an if clause around it and see if it's really true. – Jan Jul 20 '20 at 06:42
  • The result with `decode=True` is already decoded; why do you attempt to `.decode()` it again? – tripleee Jul 20 '20 at 06:45
  • Ok thank you this was the problem. @tripleee – lolaa Jul 20 '20 at 06:49
  • The tutorial you are following had a `try:`/`except` around this single statement. Now you know why. Without a sample message, we can't really tell you why it didn't have a payload; I'm vaguely guessing some multipart container which by itself doesn't contain any payloads. – tripleee Jul 20 '20 at 06:50

2 Answers


From the documentation:

Optional decode is a flag indicating whether the payload should be decoded or not, according to the Content-Transfer-Encoding header. When True and the message is not a multipart, the payload will be decoded if this header’s value is quoted-printable or base64. If some other encoding is used, or Content-Transfer-Encoding header is missing, or if the payload has bogus base64 data, the payload is returned as-is (undecoded). If the message is a multipart and the decode flag is True, then None is returned. The default for decode is False.

(Note: this link is for Python 2; for whatever reason the corresponding page for Python 3 doesn't seem to mention get_payload.)
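
To see the behaviour quoted above without touching a mailbox, here is a minimal sketch that builds a small multipart message in memory and records what get_payload(decode=True) gives for each part yielded by walk(). The subject and body texts are made up for illustration; nothing here comes from the question's inbox.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build a made-up multipart/alternative message entirely in memory.
msg = MIMEMultipart("alternative")
msg["Subject"] = "demo"
msg.attach(MIMEText("Hello", "plain", "utf-8"))
msg.attach(MIMEText("<b>Hello</b>", "html", "utf-8"))

# walk() yields the multipart container itself first, then each leaf part.
results = []
for part in msg.walk():
    payload = part.get_payload(decode=True)  # None for the container, bytes for leaves
    results.append((part.get_content_type(), payload))
    print(part.get_content_type(), repr(payload))
```

The first entry is the container, which has no payload of its own to decode; the two text parts come back as bytes.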

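Note that, according to this quote, a part of a message that:

  • is missing the content-transfer-encoding (gives email.message no indication how it is meant to be decoded), or
  • is using an encoding other than QP or base64 (email.message does not support decoding it), or
  • claims to be base64 encoded but contains a wrongly encoded string that cannot be decoded

has its payload returned as-is, not as None. The None case is the multipart one: msg.walk() also yields the multipart container parts themselves (starting with the top-level message), and for those get_payload(decode=True) returns None. So the error does not mean that an email is empty.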

The best thing to do is probably just to skip over it.

Replace:

                    body = part.get_payload(decode=True).decode()

with:

                    payload = part.get_payload(decode=True)
                    if payload is None:
                        continue
                    body = payload.decode()

Although I am not sure whether the decode() method that you are calling on the payload is doing anything useful beyond the decoding that get_payload has already done when using decode=True. You should probably test this, and if you find that this call to decode does not do anything (i.e. if body and payload are equal), then you would probably omit this step entirely:

                    body = part.get_payload(decode=True)
                    if body is None:
                        continue
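
On the question of whether the extra decode() does anything: in Python 3, get_payload(decode=True) only undoes the Content-Transfer-Encoding and returns bytes, so a bytes-to-str step is still needed if you want text. Below is a minimal sketch, assuming you only care about text parts. The helper name text_bodies and the UTF-8 fallback are my own choices for illustration, not something from the question or the tutorial.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def text_bodies(msg):
    """Return the decoded text of every text/* part of msg (hypothetical helper)."""
    bodies = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # skips multipart containers, images, other attachments
        payload = part.get_payload(decode=True)  # bytes, or None
        if payload is None:
            continue
        # Use the charset declared by the part; fall back to UTF-8 (an assumption).
        charset = part.get_content_charset() or "utf-8"
        bodies.append(payload.decode(charset, errors="replace"))
    return bodies


# Made-up message whose text part is not UTF-8, to show why the charset matters.
demo = MIMEMultipart("mixed")
demo.attach(MIMEText("Grüße", "plain", "iso-8859-1"))
print(text_bodies(demo))
```

Calling a bare .decode() assumes UTF-8, which can fail or garble text for parts that declare another charset; reading the charset from the part avoids that for well-formed messages.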

If you add some print statements regarding from_ and subject, you should be able to identify the affected message(s), and then go to "show original" in gmail to compare, and see exactly what is going on.
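
As a rough sketch of that debugging step, the function below reports the subject, sender and content type of every part whose decoded payload is None. The name describe_empty_parts and the demo message are hypothetical; in the question's loop you would pass in the msg object returned by email.message_from_bytes.

```python
from email.header import decode_header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def describe_empty_parts(msg):
    """List (subject, sender, content type) for parts with no decodable payload."""
    subject, charset = decode_header(msg.get("Subject", ""))[0]
    if isinstance(subject, bytes):
        # Encoded-word subjects come back as bytes plus their charset.
        subject = subject.decode(charset or "utf-8", errors="replace")
    sender = msg.get("From", "")
    return [
        (subject, sender, part.get_content_type())
        for part in msg.walk()
        if part.get_payload(decode=True) is None
    ]


# Made-up message with an RFC 2047 encoded subject.
demo = MIMEMultipart("alternative")
demo["Subject"] = "=?utf-8?b?SMOpbGxv?="
demo["From"] = "someone@example.com"
demo.attach(MIMEText("Hi", "plain", "utf-8"))
for row in describe_empty_parts(demo):
    print(row)
```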

alani
  • The `email` library changed significantly in Python 3.5 (I think it was); old code will continue to work, but you'll need to consult older documentation for the deprecated interfaces and methods. On the whole, the new and overhauled library is significantly more versatile and robust, whilst still maintaining some compatibility with the old version and its conventions. – tripleee Jul 20 '20 at 11:12
  • @alani Can you check this one https://stackoverflow.com/questions/67944097/how-to-extract-the-body-of-an-email-and-save-the-attachments-using-python-imap – Amogh Katwe Jun 14 '21 at 21:07

A high-level IMAP library may help here:

from imap_tools import MailBox

# get emails from INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'pwd', 'INBOX') as mailbox:
    for msg in mailbox.fetch():
        msg.uid              # str or None: '123'
        msg.subject          # str: 'some subject 你 привет'
        msg.from_            # str: 'sender@ya.ru'
        msg.to               # tuple: ('iam@goo.ru', 'friend@ya.ru', )
        msg.cc               # tuple: ('cc@mail.ru', )
        msg.bcc              # tuple: ('bcc@mail.ru', )
        msg.reply_to         # tuple: ('reply_to@mail.ru', )
        msg.date             # datetime.datetime: 1900-1-1 for unparsed, may be naive or with tzinfo
        msg.date_str         # str: original date - 'Tue, 03 Jan 2017 22:26:59 +0500'
        msg.text             # str: 'Hello 你 Привет'
        msg.html             # str: '<b>Hello 你 Привет</b>'
        msg.flags            # tuple: ('SEEN', 'FLAGGED', 'ENCRYPTED')
        msg.headers          # dict: {'Received': ('from 1.m.ru', 'from 2.m.ru'), 'AntiVirus': ('Clean',)}

        for att in msg.attachments:  # list: [Attachment]
            att.filename         # str: 'cat.jpg'
            att.content_type     # str: 'image/jpeg'
            att.payload          # bytes: b'\xff\xd8\xff\xe0\'

https://github.com/ikvk/imap_tools

Vladimir