I'm writing an email application in Python. Currently, when I try to display an HTML email, it just shows the raw HTML markup. Is there a simple way to convert an email string to plain text for viewing?

The relevant part of my code:

rsp, data = self.s.uid('fetch', msg_id, '(BODY.PEEK[HEADER])')
raw_header = data[0][1].decode('utf-8')
rsp, data = self.s.uid('fetch', msg_id, '(BODY.PEEK[TEXT])')
raw_body = data[0][1].decode('utf-8')

header_ = email.message_from_string(raw_header)
body_ = email.message_from_string(raw_body)
self.message_box.insert(END, header_)
self.message_box.insert(END, body_)

Here, message_box is just a tkinter Text widget used to display the email.

Thanks

Jay Jen
  • If you are dealing with HTML emails, you would be better off using an HTML parser and extracting the text from its output. – Anzel Apr 10 '15 at 15:20

1 Answer

Many emails contain both an HTML version and a text/plain version. For those emails you can just take the text/plain part. For emails that only have an HTML version, you have to use an HTML parser such as BeautifulSoup to extract the text.

Something like this:

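import email
from bs4 import BeautifulSoup

# Assumes raw_body holds the complete message (headers and body), e.g. fetched
# with '(BODY.PEEK[])'; without the headers the MIME parts cannot be identified.
message = email.message_from_string(raw_body)

plain_text_body = ''
html_body = ''
for part in message.walk():  # walk() also yields a non-multipart message itself
    charset = part.get_content_charset() or 'utf-8'
    if part.get_content_type() == 'text/plain':
        # get_payload(decode=True) returns bytes, so decode to str
        plain_text_body = part.get_payload(decode=True).decode(charset, errors='replace')
        break
    if part.get_content_type() == 'text/html' and not html_body:
        html_body = part.get_payload(decode=True).decode(charset, errors='replace')

# Fall back to stripping the tags from the HTML part only, not the whole message
if plain_text_body == '' and html_body:
    plain_text_body = BeautifulSoup(html_body, 'html.parser').get_text()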

Note: I have not actually tested my code, so it probably won't work as is.
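
For comparison, here is a standard-library-only sketch of the same idea, in case installing BeautifulSoup is not an option. It assumes the complete raw message is available as bytes (for example from a fetch of '(BODY.PEEK[])' rather than separate header and text fetches). The names extract_text, html_to_text and _TextExtractor are made up for this illustration, and the tag stripping is deliberately crude, so treat it as a starting point rather than a robust converter.

```python
import email
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Strip tags from an HTML string and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with spaces keeps adjacent blocks apart; split() collapses runs.
    return " ".join(" ".join(parser.chunks).split())


def extract_text(raw_message):
    """Return readable text from a complete raw message given as bytes."""
    message = email.message_from_bytes(raw_message, policy=policy.default)
    # Prefer the text/plain part; fall back to text/html if that is all there is.
    part = message.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    content = part.get_content()  # already decoded to str by the default policy
    if part.get_content_type() == "text/html":
        return html_to_text(content)
    return content
```

In the question's code this would be used roughly as self.message_box.insert(END, extract_text(data[0][1])), passing the fetched bytes directly instead of decoding them as UTF-8 first.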

JosiahDaniels