
I am working on a two-stage digital forensics project. In stage one, I need to extract all the messages stored in several Outlook PST/OST files and save them as MSG files in a folder hierarchy like pstFilename\inbox, draft, sent... for each PST file in the sample.

For stage two, now completed, I am using Python (3.x) and the win32com module to traverse all subfolders inside the target folder, find and hash every MSG file, parse a number of MSG properties, and finally create a CSV report. I found plenty of documentation and code samples for parsing an MSG file with Python and win32com, but not much on how to parse a standalone PST file, as opposed to the PST file associated with the Outlook user profile on the local computer.
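
For context, the hashing and CSV part of stage two can be sketched roughly as follows. This is a minimal illustration and not my exact code: SHA-256, the column names, and the helper names `sha256_of` and `write_hash_report` are assumptions made for the example.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_hash_report(root, report_path):
    """Hash every .msg file below `root` and write one CSV row per file."""
    rows = 0
    with open(report_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "sha256"])  # hypothetical column names
        for msg_path in sorted(Path(root).rglob("*.msg")):
            writer.writerow([str(msg_path), sha256_of(msg_path)])
            rows += 1
    return rows
```

The MSG property parsing would add further columns to each row.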

I am looking for a way to open a PST file using the win32com module, traverse all folders in it, and export/save every message as an MSG file to the corresponding pstfilename_folder\subfolder.

There is a very straightforward method to access MSG files:


import win32com.client

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
msg = outlook.OpenSharedItem(r"/test_files/test.msg")

print(msg.SenderName)
print(msg.SenderEmailAddress)
print(msg.SentOn)
print(msg.To)
print(msg.CC)
print(msg.BCC)
print(msg.Subject)
print(msg.Body)

count_attachments = msg.Attachments.Count
if count_attachments > 0:
    for item in range(count_attachments):
        print(msg.Attachments.Item(item + 1).Filename)

del outlook, msg

Is there any equivalent method to access and manipulate a PST file using the win32com module?

I found this link: https://learn.microsoft.com/en-us/dotnet/api/microsoft.office.interop.outlook.store?view=outlook-pia

but I am not sure how to use it in Python.

HIC

2 Answers


This is something that I want to do for my own application. I was able to piece together a solution from these sources:

  1. https://gist.github.com/attibalazs/d4c0f9a1d21a0b24ff375690fbb9f9a7
  2. https://github.com/matthewproctor/OutlookAttachmentExtractor
  3. https://learn.microsoft.com/en-us/office/vba/api/outlook.namespace

My solution doesn't save the .msg files as you request in your question, but unless you have a secondary use for the exported files, reading the messages directly from the PST should save you a step.

import win32com.client

def find_pst_folder(OutlookObj, pst_filepath) :
    for Store in OutlookObj.Stores :
        if Store.IsDataFileStore and Store.FilePath == pst_filepath :
            return Store.GetRootFolder()
    return None

def enumerate_folders(FolderObj) :
    for ChildFolder in FolderObj.Folders :
        enumerate_folders(ChildFolder)
    iterate_messages(FolderObj)

def iterate_messages(FolderObj) :
    for item in FolderObj.Items :
        print("***************************************")
        print(item.SenderName)
        print(item.SenderEmailAddress)
        print(item.SentOn)
        print(item.To)
        print(item.CC)
        print(item.BCC)
        print(item.Subject)

        count_attachments = item.Attachments.Count
        if count_attachments > 0 :
            for att in range(count_attachments) :
                print(item.Attachments.Item(att + 1).Filename)

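Outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")

pst = r"C:\Users\Joe\Your\PST\Path\example.pst"
Outlook.AddStore(pst)  # mounts the PST in the current Outlook profile
PSTFolderObj = find_pst_folder(Outlook, pst)
if PSTFolderObj is None :
    # No mounted store matched the path, so there is nothing to walk or remove
    print("PST store not found: " + pst)
else :
    try :
        enumerate_folders(PSTFolderObj)
    except Exception as exc :
        print(exc)
    finally :
        # RemoveStore takes the root folder of the store to detach
        Outlook.RemoveStore(PSTFolderObj)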
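
If you do need the .msg files on disk, the same traversal can call each item's SaveAs method instead of printing. The following is a minimal sketch and not a tested forensic tool: `safe_name`, `export_folder`, and the output directory are hypothetical names, and it assumes every item exposes `Subject` and `SaveAs` the way mail items do. The value 9 is olMSGUnicode from Outlook's OlSaveAsType enumeration (3 is the ANSI olMSG format).

```python
import os
import re

OL_MSG_UNICODE = 9  # OlSaveAsType.olMSGUnicode; 3 (olMSG) gives ANSI .msg files


def safe_name(text, max_len=80):
    """Replace characters Windows forbids in file names and trim the length."""
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", str(text or "")).strip(" .")
    return cleaned[:max_len] or "untitled"


def export_folder(folder, out_dir):
    """Recursively save every item in `folder` as a .msg file under `out_dir`.

    Returns the number of items saved. `out_dir` should be an absolute path.
    """
    target = os.path.join(out_dir, safe_name(folder.Name))
    os.makedirs(target, exist_ok=True)
    saved = 0
    for index, item in enumerate(folder.Items, start=1):
        subject = getattr(item, "Subject", "")
        # The index prefix keeps names unique when subjects repeat
        path = os.path.join(target, f"{index:05d}_{safe_name(subject)}.msg")
        try:
            item.SaveAs(path, OL_MSG_UNICODE)
            saved += 1
        except Exception as exc:  # report the failure and keep going
            print(f"skipped {path}: {exc}")
    for child in folder.Folders:
        saved += export_folder(child, target)
    return saved
```

With the script above, this would be called as `export_folder(PSTFolderObj, r"C:\export")` in place of `enumerate_folders(PSTFolderObj)`, where `C:\export` is a placeholder output directory. Each Outlook folder becomes a directory named after it, starting with the PST's root folder.
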
Joe Cole

I use the MSG PY module from Independentsoft at work, and so far it has worked well. It is a Microsoft Outlook .msg file module for Python that lets you easily create, read, parse, and convert Outlook .msg files. For example:

from independentsoft.msg import Message

appointment = Message("e:\\appointment.msg")

print("subject: " + str(appointment.subject))
print("start_time: " + str(appointment.appointment_start_time))
print("end_time: " + str(appointment.appointment_end_time))
print("location: " + str(appointment.location))
print("is_reminder_set: " + str(appointment.is_reminder_set))
print("sender_name: " + str(appointment.sender_name))
print("sender_email_address: " + str(appointment.sender_email_address))
print("display_to: " + str(appointment.display_to))
print("display_cc: " + str(appointment.display_cc))

print("body: " + str(appointment.body))

Taki