I am trying to download an email from the Outlook Sent Items folder. Currently I am able to save it in '.msg' format. Is there any way to save the mail as '.html' or '.pdf' using Python?

from pathlib import Path
import win32com.client as win32
from datetime import date, timedelta
import os
import glob

# Create output folder
output_dir = Path.cwd()
output_dir.mkdir(parents=True, exist_ok=True)

# Connect to Outlook (MAPI namespace)
outlook = win32.Dispatch('outlook.application').GetNamespace("MAPI")

# Connect to the Sent Items folder (5 = olFolderSentMail)
sent_items = outlook.GetDefaultFolder(5)

# Get the required mail and store it locally
messages = sent_items.items
message = messages.GetLast()
name = str(message.subject)
message.saveas(os.getcwd()+'//'+name+".msg")

When I replace .msg with .html or .pdf in the last line, it does not work. The resulting .html or .pdf file shows up as special characters rather than the actual message content.

Eugene Astafiev
Bharath
  • Try `message.saveas(os.getcwd()+'//'+name+".html", 5)` (5 is Outlook's format const for olHTML) – Alex K. Oct 12 '22 at 14:47
  • @AlexK. Thanks, the above solution works. Just one follow-up question: my mail has an image embedded in the body, but when I save the mail in .html format the image is not visible; it shows up blank. Is there any way to get the image displayed as well? – Bharath Oct 13 '22 at 09:04

2 Answers

The Outlook object model doesn't provide any property or method for saving messages in the PDF file format. The OlSaveAsType enumeration lists all of the available file formats, and HTML (.html) is one of them. So you just need to pass the olHTML value as the second parameter, in addition to the file path:

message.SaveAs(os.path.join(os.getcwd(), name + ".html"), 5)  # 5 = OlSaveAsType.olHTML
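Simply renaming the extension produces unreadable output because `SaveAs` falls back to the MSG format when no type is passed, so the file still contains binary MSG data. Below is a minimal sketch of a helper that passes the type explicitly and also replaces characters that Windows does not allow in file names (a subject such as "Re: report" would otherwise break the path). `safe_filename` and `save_message` are hypothetical names, not part of the Outlook API; the numeric constants are the documented `OlSaveAsType` values.

```python
import os
import re

# Numeric values of the OlSaveAsType members used here.
OL_MSG = 3
OL_HTML = 5
OL_MHTML = 10

# File extension to use for each supported save type.
_EXTENSIONS = {OL_MSG: ".msg", OL_HTML: ".html", OL_MHTML: ".mht"}


def safe_filename(subject, fallback="untitled"):
    """Replace characters that Windows forbids in file names."""
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", subject or "")
    cleaned = cleaned.strip(" .")  # trailing dots/spaces are also rejected
    return cleaned or fallback


def save_message(message, folder, save_type=OL_HTML):
    """Save an Outlook MailItem (COM object) and return the path written."""
    name = safe_filename(message.Subject) + _EXTENSIONS[save_type]
    path = os.path.join(folder, name)
    message.SaveAs(path, save_type)  # second argument selects the format
    return path
```

With a real `message` obtained as in the question, `save_message(message, os.getcwd())` would write the HTML file. Passing `OL_MHTML` should produce a single `.mht` web archive instead; whether that keeps inline images in a given Outlook version is worth testing rather than assuming.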

If you really need to save the message as a PDF, consider using the Word object model. The Document.ExportAsFixedFormat2 method saves a document in PDF or XPS format. Use the message's GetInspector property to get its inspector; the Inspector.WordEditor property then returns the Word Document object that represents the message body. WordEditor is only valid if the IsWordMail method returns true and the EditorType property is olEditorWord. The returned Document object provides access to most of the Word object model.
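The WordEditor route described above can be sketched roughly as follows. This is an outline under stated assumptions, not a verified recipe: it assumes the message body is hosted in the Word editor, and it calls the older `Document.ExportAsFixedFormat` method with `wdExportFormatPDF` (17) rather than `ExportAsFixedFormat2`. `export_message_to_pdf` is a hypothetical helper name. On some setups `WordEditor` may only become available after the inspector has been displayed.

```python
OL_EDITOR_WORD = 4          # OlEditorType.olEditorWord
WD_EXPORT_FORMAT_PDF = 17   # WdExportFormat.wdExportFormatPDF


def export_message_to_pdf(message, out_path):
    """Export the body of an Outlook MailItem (COM object) to a PDF file."""
    inspector = message.GetInspector  # a property on MailItem, not a method call
    if inspector.EditorType != OL_EDITOR_WORD:
        raise RuntimeError("message body is not hosted in the Word editor")
    document = inspector.WordEditor   # Word Document object for the body
    if document is None:
        raise RuntimeError("WordEditor is not available for this message")
    document.ExportAsFixedFormat(out_path, WD_EXPORT_FORMAT_PDF)
    return out_path
```

Because the function only touches attributes of the object it is given, it can be exercised with a stub before trying it against a live Outlook session.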

Eugene Astafiev

You can save the message as a Word document first and then let Word convert it to PDF:

import os
from win32com.client import Dispatch

outlook = Dispatch("Outlook.Application").GetNamespace("MAPI")

root_folder = outlook.Folders.Item(1)
messages = root_folder.Folders['your folder']
message = messages.Items.Item(1)  # Outlook collections are 1-based

# or
# inbox = outlook.GetDefaultFolder(6)
# messages = inbox.Items
# message = messages.GetLast()

in_file = os.path.join(os.getcwd(), 'mailfile.doc')

message.SaveAs(in_file, 4)  # OlSaveAsType 4 = olDoc (Word .doc file)

out_file = os.path.join(os.getcwd(), 'mailfile.pdf')

wdFormatPDF = 17  # WdSaveFormat value for PDF
word = Dispatch('Word.Application')
doc = word.Documents.Open(in_file)
doc.SaveAs(out_file, FileFormat=wdFormatPDF)
doc.Close()
word.Quit()
os.remove(in_file)  # delete the temporary .doc file
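One caveat with the snippet above: if `Open` or `SaveAs` raises, Word keeps running in the background and the temporary `.doc` file is left behind. Here is a sketch of the same conversion with the cleanup moved into `finally` blocks. `doc_to_pdf` is a hypothetical helper name, and `word` is assumed to be the object returned by `Dispatch('Word.Application')`.

```python
import os

WD_FORMAT_PDF = 17  # WdSaveFormat.wdFormatPDF


def doc_to_pdf(word, in_file, out_file, remove_source=True):
    """Convert in_file to PDF using a Word.Application COM object, then quit Word."""
    try:
        doc = word.Documents.Open(in_file)
        try:
            doc.SaveAs(out_file, FileFormat=WD_FORMAT_PDF)
        finally:
            doc.Close(False)  # False: close without saving changes to the .doc
    finally:
        word.Quit()  # always release the Word process, even after an error
        if remove_source and os.path.exists(in_file):
            os.remove(in_file)  # delete the temporary .doc file
    return out_file
```

It would be called as `doc_to_pdf(Dispatch('Word.Application'), in_file, out_file)` after `message.SaveAs(in_file, 4)`.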