
I'm trying to parse the body of a forwarded email using the following Python code:

import imapclient
import os
import pprint
import pyzmail
import email

#my email info
EMAIL_ADRESS = os.environ.get('DB_USER')
EMAIL_PASSWORD = os.environ.get('PYTHON_PASS')

#login to my email
imap0bj =  imapclient.IMAPClient('imap.gmail.com', ssl = True)
imap0bj.login(EMAIL_ADRESS, EMAIL_PASSWORD )
print("ok")


pprint.pprint(imap0bj.list_folders())
#Selecting my Inbox
imap0bj.select_folder('INBOX', readonly = True)

#Getting UIDs from Inbox
UIDs = imap0bj.search(['SUBJECT', 'Contact FB Applicant', 'ON', '16-Oct-2020'])
print(UIDs)


rawMessages = imap0bj.fetch(UIDs, ['BODY[]'])
message = pyzmail.PyzMessage.factory(rawMessages[9999][b'BODY[]'])

assert message.text_part is not None  # make sure the message has a plain-text part
#Body of the email returned as a string
msg = message.text_part.get_payload().decode(message.text_part.charset)

print(msg)

imap0bj.logout()

This code outputs a string similar to this:

   ---------- Forwarded message ---------
    From: Someone <Mail@mail.biz>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd:  Contact FB Applicant
    To: <mail@mail.com>
    
    
    
    
   ---------- Forwarded message ---------
    From: Someone <Mail@mail.biz>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd:  Contact FB Applicant
    To: <mail@mail.com>
    
    
    The following applicant filled out the form via Facebook.  Contact
    immediately.
    
    Some Guy
    999999999999
    mail@mail.com

But I don't want the "Forwarded message" parts. I just want it from "The following applicant..." and onwards which is the info I care about. How do I get rid of the other stuff? I'd really appreciate the help. Thank you!

2 Answers


You can use io.StringIO.

Here's how you would use it:

from io import StringIO

# your code goes here
...
...

msg = message.text_part.get_payload().decode(message.text_part.charset)

sio = StringIO(msg)

sio.seek(msg.index('The following applicant'))

for line in sio:
    print(line, end='')  # each line already ends with '\n'

How it works:

StringIO lets you treat a string as a stream (a file-like object). StringIO.seek moves the stream's position to a given offset (0 is the beginning of the stream). str.index returns the index of the first occurrence of a substring and raises ValueError if the substring is not found. Putting it all together: you move the stream's position to the first occurrence of the text you want, and then read from the stream.
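The same idea can be wrapped in a small helper. This is only a minimal sketch: the function name text_from_marker and the sample string are made up for illustration, and it uses str.find rather than str.index so that a missing marker returns the text unchanged instead of raising.

```python
from io import StringIO

# Hypothetical sample standing in for the decoded message body.
sample = (
    "---------- Forwarded message ---------\n"
    "From: Someone <Mail@mail.biz>\n"
    "Subject: Fwd:  Contact FB Applicant\n"
    "\n"
    "The following applicant filled out the form via Facebook.  Contact\n"
    "immediately.\n"
    "\n"
    "Some Guy\n"
)


def text_from_marker(msg, marker="The following applicant"):
    """Return msg from the first occurrence of marker onward.

    If the marker is absent, msg is returned unchanged.
    """
    start = msg.find(marker)  # find returns -1 where index would raise ValueError
    if start == -1:
        return msg
    sio = StringIO(msg)
    sio.seek(start)  # move the stream position to the marker
    return sio.read()  # read everything from the marker to the end


body = text_from_marker(sample)
print(body)
```

For a plain string, the slice msg[start:] gives the same result. The stream form is mainly useful if you want to keep iterating over the remaining lines afterwards.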

Alex

Judging from this format, you need to read the text line by line. If you encounter a line that starts with '---' (that is, line[:3] == '---'), ignore it and the lines after it until you read an empty line. If the next non-empty line starts with '---' again, repeat the process. The first non-empty line after that should be "The following applicant...".

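You can put this logic in a loop and break out once the headers have been skipped. Here is a sketch in Python, where msg is the decoded body from the question:

lines = iter(msg.splitlines())
body_lines = []
for line in lines:
    if not line.strip():
        continue  # skip blank lines before and between header blocks
    if line.strip().startswith('---'):
        # skip the forwarded header block, up to and including the next blank line
        for header_line in lines:
            if not header_line.strip():
                break
        continue
    # first non-empty line outside a header block: keep it and everything after it
    body_lines = [line, *lines]
    break
print('\n'.join(body_lines))

This relies on lines being an iterator: it remembers how many lines have been read, so the inner loop and the outer loop both continue from the next unread line.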

George Y