Using regular expressions, I have managed to extract all sender addresses from the emails in my Inbox. However, I have tried and failed many times to also extract the associated UIDs for those individual emails.

Here's what I have so far:

    import email
    import re

    # Assumes `mail` (a logged-in imaplib connection with a mailbox selected),
    # `LOGIN` (my own address) and `email_list` (a list) are defined earlier.
    result, data = mail.search(None, 'ALL')
    ids = data[0]
    id_list = ids.split()
    for i in id_list:
        typ, data = mail.fetch(i, '(RFC822)')
        for response_part in data:
            if isinstance(response_part, tuple):
                msg = email.message_from_bytes(response_part[1])
                sender = msg['from'].split()[-1]
                address = re.sub(r'[<>]', '', sender)
        # Ignore any occurrences of own email address and add to list
        if not re.search(re.escape(LOGIN), address) and address not in email_list:
            email_list.append(address)
            print(address)

The output is slow (I'm assuming because of the regular expressions), but nonetheless it gets the job done.

Output:

    no-reply@mail.instagram.com
    no-reply@accounts.google.com
    rhodesi926@icloud.com
    wat@elevenyellow.com
    pinbot@notifications.pinterest.com
    support@autopin.co
    pinbot@account.pinterest.com
    info@shootbox.me
    pinbot@explore.pinterest.com
    bugra@boostfy.co
    mail-noreply@google.com
    pinbot@inspire.pinterest.com
    mua@mikasabeauty.com
    noreply@apple.com
    privacy-noreply@policies.google.com

Part of the problem is I don't understand how the UIDs are connected to the sender and where the UIDs get stored in the IMAP structure.

I'm assuming I could write a regular expression that pulls any four-digit number from the "UID:" field, but I fear that would slow my script down to a crawl.

If anyone understands imaplib and can help, I would be eternally grateful. Thank you.

  • UIDs aren't part of the email headers, so you won't get them from fetching `RFC822`. You need to add `UID` to your fetch list (`(UID RFC822)`), and then parse `response_part[0]` and the non-tuple pieces (which may be the non-literal parts of the response). The script is slow because you're fetching the messages one at a time, not because of the regular expressions. You could do a full MIME parse hundreds of times in one network round trip. imaplib does not include an IMAP parser, so you'll have lots of work to do, or you should find a higher-level library. – Max Jun 27 '18 at 00:40
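
To make the comment's suggestion concrete, here is a minimal sketch, not a definitive implementation. It asks for `UID` and only the `From` header for every message in a single `fetch` call, then reads the UID from either the tuple's first element or the plain-bytes piece that follows it. The names `parse_uid_senders`, `fetch_uid_senders` and `UID_RE` are hypothetical helpers introduced for illustration. `mail` is assumed to be a logged-in `imaplib` connection with a non-empty mailbox selected, and the exact layout of the response may vary between servers.

```python
import email
import email.utils
import re

# The UID appears in the non-literal part of the response, e.g. b'1 (UID 101 ...'
UID_RE = re.compile(rb'UID (\d+)')


def parse_uid_senders(data):
    """Pair each message's UID with its sender address.

    `data` is the second element of the tuple returned by imaplib's fetch().
    """
    pairs = []
    pending = None  # sender whose UID has not been seen yet
    for part in data:
        if isinstance(part, tuple):
            envelope, literal = part
            msg = email.message_from_bytes(literal)
            # parseaddr handles both "Name <addr>" and bare "addr" forms
            address = email.utils.parseaddr(msg.get('From', ''))[1]
            match = UID_RE.search(envelope)
            if match:
                pairs.append((int(match.group(1)), address))
                pending = None
            else:
                pending = address
        elif isinstance(part, bytes) and pending is not None:
            # Some servers send the UID after the literal, e.g. b' UID 205)'
            match = UID_RE.search(part)
            if match:
                pairs.append((int(match.group(1)), pending))
                pending = None
    return pairs


def fetch_uid_senders(mail):
    """Fetch UID and From header for all messages in one round trip."""
    typ, data = mail.fetch('1:*', '(UID BODY.PEEK[HEADER.FIELDS (FROM)])')
    return parse_uid_senders(data)
```

With a connection in hand, `for uid, address in fetch_uid_senders(mail): print(uid, address)` would list each pair. Using `BODY.PEEK` avoids marking messages as read, and fetching only the `From` header keeps the transfer small compared with `RFC822`.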
