I have the below code to download email attachments based on date sent and email subject criteria:

from datetime import date, timedelta
import os
import win32com.client


path = os.path.expanduser("C:\\Users\\xxxx\\Documents\\Projects\\VBA Projects\\VLOOKUP Automation\\Vlookup File Location")
today = date.today()

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders("xxx").Folders.Item("Inbox")
messages = inbox.Items
subject = "xxx"

# Exclusive bounds: only messages sent today fall between them
dateLow = date.today() - timedelta(days=1)
dateHigh = date.today() + timedelta(days=1)

max = 2500
for count, message in enumerate(messages):
    if count > max:
        break
    if subject in message.subject and message.senton.date() > dateLow and message.senton.date() < dateHigh:
            attachments = message.Attachments
            num_attach = len([x for x in attachments])
            for x in range(1, num_attach+1):
                attachment = attachments.Item(x)
                attachment.SaveASFile(path + '\\' + str(attachment))

Is there any way to restrict the download to certain attachment types, for example only .csv files?

Additionally, this code was previously used on a public folder; those folders have since been converted to shared folders. Since that change, I have had to increase "max" from 500 to 2500 in order to find the specified emails. Is there any way to speed this up?

Thanks

Saadiq

2 Answers

Below is a way to specify which file types you want.

List the file extensions you want, each with its leading dot, in attachments_of_interest. If the list is empty, every attachment is saved.

from datetime import date, timedelta
import os
import win32com.client


path = os.path.expanduser("C:\\Users\\xxxx\\Documents\\Projects\\VBA Projects\\VLOOKUP Automation\\Vlookup File Location")
today = date.today()

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders("xxx").Folders.Item("Inbox")
messages = inbox.Items
subject = "xxx"

# Exclusive bounds: only messages sent today fall between them
dateLow = date.today() - timedelta(days=1)
dateHigh = date.today() + timedelta(days=1)

max_n = 2500
attachments_of_interest = ['.csv']

for count, message in enumerate(messages):
    if count > max_n:
        break
    if subject in message.subject and message.senton.date() > dateLow and message.senton.date() < dateHigh:
        attachments = message.Attachments
        # Outlook collections are 1-indexed
        for x in range(1, attachments.Count + 1):
            attachment = attachments.Item(x)
            attachment_fname = attachment.FileName
            # splitext keeps the leading dot, so '.csv' matches the list entries
            file_ending = os.path.splitext(attachment_fname)[1].lower()
            if not attachments_of_interest or file_ending in attachments_of_interest:
                attachment.SaveAsFile(os.path.join(path, attachment_fname))
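
The extension check can be tried without Outlook by isolating it in a small helper. The sketch below is only an illustration: `wants_attachment` is a hypothetical name, and it assumes the attachment's file name is available as a plain string (for example from the `FileName` property of an Outlook attachment). It compares extensions with their leading dot and ignores case.

```python
import os


def wants_attachment(filename, extensions):
    """Return True if filename ends in one of the given extensions.

    Hypothetical helper: extensions are written with a leading dot
    (e.g. '.csv'); an empty list accepts every file.
    """
    if not extensions:
        return True
    # os.path.splitext keeps the dot: "a.b.CSV" -> ("a.b", ".CSV")
    ending = os.path.splitext(filename)[1].lower()
    return ending in {ext.lower() for ext in extensions}


if __name__ == "__main__":
    for name in ["report.csv", "REPORT.CSV", "summary.pdf", "archive.tar.gz"]:
        print(name, wants_attachment(name, [".csv"]))
```

Keeping the dot on both sides of the comparison avoids a mismatch between an entry such as '.csv' and a bare 'csv'.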

As for speeding this up, you could use a thread pool:

from multiprocessing.pool import ThreadPool as Pool
from datetime import date, timedelta
import os
import win32com.client


path = os.path.expanduser("C:\\Users\\xxxx\\Documents\\Projects\\VBA Projects\\VLOOKUP Automation\\Vlookup File Location")
today = date.today()

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders("xxx").Folders.Item("Inbox")
messages = inbox.Items
subject = "xxx"

max_n = 2500
attachments_of_interest = ['.csv']
pool_size = 5

# define worker function before a Pool is instantiated
def worker(message):
    # Exclusive bounds: only messages sent today fall between them
    dateLow = date.today() - timedelta(days=1)
    dateHigh = date.today() + timedelta(days=1)
    if subject in message.subject and message.senton.date() > dateLow and message.senton.date() < dateHigh:
        attachments = message.Attachments
        # Outlook collections are 1-indexed
        for x in range(1, attachments.Count + 1):
            attachment = attachments.Item(x)
            attachment_fname = attachment.FileName
            # splitext keeps the leading dot, so '.csv' matches the list entries
            file_ending = os.path.splitext(attachment_fname)[1].lower()
            if not attachments_of_interest or file_ending in attachments_of_interest:
                attachment.SaveAsFile(os.path.join(path, attachment_fname))

pool = Pool(pool_size)

for count, message in enumerate(messages):
    if count > max_n:
        break
    pool.apply_async(worker, (message,))

pool.close()
pool.join()
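
Another option for speed, separate from the pool above, is to let Outlook narrow the collection before Python touches each message. The sketch below relies on the `Restrict` method of the folder's `Items` collection with a `[SentOn]` date filter. It assumes Outlook accepts dates written as `mm/dd/yyyy hh:mm AM/PM`; that format depends on the machine's regional settings, so treat it as an assumption to verify. `sent_on_filter` and `matching_messages` are hypothetical names.

```python
from datetime import date, timedelta


def sent_on_filter(day):
    """Build a Restrict filter that selects items sent on `day`.

    Assumes a US-style date format; adjust for other regional settings.
    """
    next_day = day + timedelta(days=1)

    def fmt(d):
        return f"{d.month:02d}/{d.day:02d}/{d.year} 12:00 AM"

    return f"[SentOn] >= '{fmt(day)}' AND [SentOn] < '{fmt(next_day)}'"


def matching_messages(inbox, subject, day):
    """Yield messages sent on `day` whose subject contains `subject`.

    `inbox` is an Outlook folder object such as the one obtained in the
    code above; Restrict returns only the items that pass the filter.
    """
    restricted = inbox.Items.Restrict(sent_on_filter(day))
    for message in restricted:
        if subject in message.Subject:
            yield message


if __name__ == "__main__":
    print(sent_on_filter(date(2019, 11, 19)))
```

With the collection narrowed by date first, the loop only visits that day's messages, so a fixed item cap may no longer be needed.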
alexisdevarennes
  • hey! Thanks for the answer, however, when implementing the code it seems that no attachments are being saved at all. – Saadiq Nov 19 '19 at 13:11
  • Could you please try again, I've updated my code. Make sure that file extensions you want to catch are specified in the list. I updated the code so that if the list is empty it still works (for all filetypes) – alexisdevarennes Nov 19 '19 at 15:11
  • Hey, I copied that exact code and included my required extension. It is still not downloading anything. I used my old code to test it on the same emails, and attachments are being downloaded, albeit both pdf and xlsx where I just need xlsx. Just to confirm that there isn't a different issue. – Saadiq Nov 20 '19 at 08:44
  • Could you please print out what attachment_fname looks like and give me some examples? This will help me debug, a bit hard to do it without your setup (on ubuntu and I have no outlook) – alexisdevarennes Nov 20 '19 at 09:09
  • Hey, by the attachment_fname do you want me to provide you with the full file name and extension as it appears on the email? I am not entirely sure what you're referring to – Saadiq Nov 20 '19 at 13:02
  • If you could do print(attachment_fname) right after attachment_fname = str(attachment) and then print(file_ending) right after file_ending = attachment_fname.split('.')[-1] that'd be awesome. – alexisdevarennes Nov 20 '19 at 15:18
  • Okay, so I've done that and nothing is being printed at all. I have no idea why. – Saadiq Nov 21 '19 at 10:17
  • Hmm that's weird, did you adapt the subject variable so that it matches? It should work given it's the same logic you posted :) – alexisdevarennes Nov 21 '19 at 13:20
  • Hey, yeah I did. Lol, I have no idea what's up with it. Thanks for all your help, but I really have no idea. – Saadiq Nov 26 '19 at 14:42

I think the .csv filter is a separate part of the requirement; for the looping part, the Outlook Items collection has some methods you can use. Instead of iterating over messages = inbox.Items, keep the collection in a variable, items = inbox.Items, and get the first message with message = items.GetFirst().

Then call message = items.GetNext() on that same collection for each following message. This way you only ever hold one message in memory, and you can keep looping for longer.

Make sure you reference the Microsoft Outlook 16.0 Object Library, or at least a version higher than 10, so that this method exists. Here is the C# code I used with GetFirst():

// oItems is the folder's Items collection
Outlook.MailItem oMsg = (Outlook.MailItem)oItems.GetFirst();

// Output some common properties.
Console.WriteLine(oMsg.Subject);
Console.WriteLine(oMsg.SenderName);
Console.WriteLine(oMsg.ReceivedTime);
Console.WriteLine(oMsg.Body);

// Check for attachments.
int AttachCnt = oMsg.Attachments.Count;
Console.WriteLine("Attachments: " + AttachCnt.ToString());

// Move on to the next item in the same collection.
Outlook.MailItem oMsg1 = (Outlook.MailItem)oItems.GetNext();
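
The same GetFirst/GetNext pattern can be sketched in Python. This is an illustration under assumptions: `iter_items` is a hypothetical helper, and `items` stands for a single Items collection kept in one variable (for example `items = inbox.Items`), because the position is tracked per collection object. It also assumes GetFirst and GetNext return None once nothing is left. `FakeItems` is a stand-in so the sketch runs without Outlook.

```python
def iter_items(items, limit=None):
    """Yield items one at a time using GetFirst/GetNext.

    `items` must be one Items collection object reused for every call;
    `limit` optionally caps how many items are yielded.
    """
    count = 0
    item = items.GetFirst()
    while item is not None and (limit is None or count < limit):
        yield item
        count += 1
        item = items.GetNext()


class FakeItems:
    """Stand-in for an Outlook Items collection, for demonstration only."""

    def __init__(self, values):
        self._values = list(values)
        self._pos = 0

    def GetFirst(self):
        self._pos = 0
        return self._values[0] if self._values else None

    def GetNext(self):
        self._pos += 1
        if self._pos < len(self._values):
            return self._values[self._pos]
        return None


if __name__ == "__main__":
    for item in iter_items(FakeItems(["a", "b", "c"]), limit=2):
        print(item)
```

With a real folder, the call would look like `for message in iter_items(inbox.Items, limit=2500):`, where the collection is fetched once and reused inside the helper.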
Jin Thakur