I am trying to print pieces of an email (from, subject, body) with Python 3.5. I get a weird index error:

Traceback (most recent call last):
  File "/home/user/PycharmProjects/email/imaplib_fetch_rfc822.py", line 21, in <module>
    subject_, from_, to_, body_, = ' ' + email[0], email[1], email[2], 'Body: ' + email[4]
IndexError: list index out of range

I am not sure what is causing this, though I suspect that my tangle of fors and splits is the culprit. I am a noobie coder, so I imagine what I'm doing is far from elegant:

import imaplib
import imaplib_connect

with imaplib_connect.open_connection() as c:
    c.select('INBOX')
    typ, [msg_ids] = c.search(None, 'TEXT', 'Sunday')
    for num in msg_ids.split():
        typ, msg_data = c.fetch(num, '(RFC822)')
        for raw_email in msg_data:
            # raw_email is a tuple of len==2, we need the 2nd item:
            b = raw_email[1]
            email = str(b).split('\r\n')
            subject_, from_, to_, body_, = ' ' + email[0], email[1], email[2], 'Body: ' + email[4]
            print(subject_, '\n',
                  from_, '\n',
                  body_, '\n')

I don't understand what the problem is. Note that I use email[0], email[1], email[2], and email[4] and skip email[3], because it prints some random email metadata junk. Can anyone see what I'm doing wrong, and what can I do to remedy the error?

Joansy
  • Can you show the code that's actually causing the error? The traceback shows you exactly what line and file it's happening in. Also, because it is driving me crazy, you do realize `for i in range(0, 1)` is a loop of **length 1** right? – Two-Bit Alchemist Jul 13 '16 at 02:31
  • For what it's worth, you don't reference `i` in your `for i in range(0, 1)` loop. Not to mention that `range(0, 1)` is essentially just referencing 0, since the 1 is excluded in the range. – Matt Cremeens Jul 13 '16 at 02:32
  • @Two-Bit Alchemist I changed my post to include where the code is causing the error, it's on the `c=[1]` line. Thanks for pointing that out! Yes, my loop is ...odd... I don't know how else to write it, as just `for i in [0]`? – Joansy Jul 13 '16 at 02:59
  • What is `a.split(", b'", maxsplit=1)` about? It looks like you are trying to split the printout of a list of bytes. – Klaus D. Jul 13 '16 at 03:43
  • Your error and code don't match up. Also, throw in a bunch of prints to output the list you are indexing and see do you even have enough elements to index. – Steven Summers Jul 13 '16 at 04:06
  • @StevenSummers Can you explain what you mean by the error and code not matching up? When I print out `b` I get 2 items in the list. I then ditch the 1st item and break up the 2nd which I name `email`. When I print `email` I get 4 items in a list. – Joansy Jul 13 '16 at 04:54
  • @KlausD. When I get the email message directly from imap, it's returned in binary and nestled in a bunch of gibberish server-speak. I isolated the email message, then I wanted to remove the ugly bits so that it's more readable for me. By itself, it looks like `(b'5 (RFC822 {173}', b'Subject: Want to go to the beach Sunday?\r\nFrom: you@email.net\r\nTo: me@email.net\r\n\r\nI had a great day at the lake yesterday. I hope you can make it Sunday?')` So I split the email everywhere there was a ", b'" to break it into pieces (to, from, subject, body). – Joansy Jul 13 '16 at 04:59
  • It is a tuple and split already, you basically put it together and split it again. Try `your_data[1].decode()` there . `decode()` might need the encoding as argument if the email is encoded. – Klaus D. Jul 13 '16 at 05:04
  • @KlausD. Can you show an example? I'm not sure what you mean. What do you mean put it together and split it again? Where do I do that? – Joansy Jul 13 '16 at 05:23
  • What I meant was your error message says `c = c[1]` but that's not in the code. If email has 4 items then `body_ = ('Body: ' + email[4])` this should give index error as well and should be [3] – Steven Summers Jul 13 '16 at 07:07
  • `c = c[1]` (from your traceback) literally is **not** in your code, so, either you've provided an incorrect traceback, or you've not shown your actual code. In either case, it's very difficult to assist if you can't accurately convey the problem and its source. I'd also note that this is potentially problematic `c = b[1]` as you've previously assigned `c` as the `imaplib_connect.open_connection()` so overwriting it as `b[1]` could be problematic. – David Zemens Jul 13 '16 at 12:21
  • @StevenSummers If I change `body_ = ('Body: ' + email[4])` to `email[3]` that is not correct because it would print nothing. There are 5 items in the list email, and that one is a blank space. So I want to skip over it. – Joansy Jul 13 '16 at 15:33
  • I incorrectly said email had 4 last night, but it actually has 5. – Joansy Jul 13 '16 at 15:40
  • @DavidZemens I just changed the c's to y. Thank you for pointing that out. I also fixed the c=c[1], that was a typo in the traceback. Arrr. – Joansy Jul 13 '16 at 15:43
  • Please double-check *again*, when you say: `y=[1]` to `y=[-1]` and `y[0]`, do you mean: `y=b[1]` to `y=b[-1]` and `y=b[0]`? Because if that's not what you mean, then you still have the problem of referencing portions of code which literally do not exist in the code as you have provided it. – David Zemens Jul 13 '16 at 15:52
  • Also add a print statement to query the value of `a` before the error occurs. This is like, basic debugging, ensuring that what you're processing is the type of data that you expect it to be... obviously it is not, so let's see why not? – David Zemens Jul 13 '16 at 15:56
  • @DavidZemens Just fixed it! Yes, what I meant was y=b[1], y=b[-1], and y=b[0]. Sorry about that, not trying to be difficult. Thanks for helping me make my question better. :) – Joansy Jul 13 '16 at 15:57
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/117235/discussion-between-david-zemens-and-joansy). – David Zemens Jul 13 '16 at 15:57
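To illustrate the point made in the comments that the fetched item is already a tuple, here is a minimal sketch. The `msg_data` list is hand-built from the sample value quoted above, not a live IMAP response; the trailing `b')'` element and the helper name `message_parts` are assumptions made for illustration.

```python
# Hand-built stand-in for the result of c.fetch(num, '(RFC822)').
# The trailing b')' element is an assumption about the server response.
msg_data = [
    (b'5 (RFC822 {173}',
     b'Subject: Want to go to the beach Sunday?\r\n'
     b'From: you@email.net\r\n'
     b'To: me@email.net\r\n'
     b'\r\n'
     b'I had a great day at the lake yesterday. I hope you can make it Sunday?'),
    b')',
]


def message_parts(msg_data):
    """Hypothetical helper: decode and line-split each (envelope, payload) tuple."""
    results = []
    for item in msg_data:
        if not isinstance(item, tuple):
            continue  # skip non-tuple items such as the closing b')'
        payload = item[1]  # the raw message bytes, no str() needed
        results.append(payload.decode('utf-8').split('\r\n'))
    return results


for lines in message_parts(msg_data):
    # Print the length first, so an out-of-range index is easy to spot.
    print(len(lines), lines)
```

Printing the length before indexing is the "basic debugging" step suggested above: it shows directly whether `email[4]` can exist.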

1 Answer

You've got a few issues with your indexing and splitting. I don't think you need to convert raw_email from a tuple to a string; just process the tuple directly.

Also, your split('\\r\\n') is incorrect; it should be split('\r\n').
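As a small illustration of why the string conversion matters in Python 3: calling `str()` on a bytes object returns its printed representation, including the `b'...'` wrapper and escaped backslashes, so there is no real carriage return or newline left to split on. The sample bytes below are abbreviated from the message quoted in the comments; nothing here touches a real mailbox.

```python
raw = (b'Subject: Want to go to the beach Sunday?\r\n'
       b'From: you@email.net\r\n'
       b'To: me@email.net\r\n'
       b'\r\n'
       b'See you Sunday?')

as_repr = str(raw)             # the repr of the bytes: "b'Subject: ...\\r\\nFrom: ..."
as_text = raw.decode('utf-8')  # real text containing real CR/LF characters

# The repr contains a literal backslash followed by 'r', not a carriage
# return, so splitting on '\r\n' finds nothing and returns one piece.
print(len(as_repr.split('\r\n')))

# The decoded text splits into the three headers, a blank line, and the body.
print(len(as_text.split('\r\n')))
```

With only one piece in the list, `email[0]` succeeds and `email[1]` raises the `IndexError` from the question.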

I think this will resolve it:

    for raw_email in msg_data:
        # raw_email is a tuple of len==2, we need the 2nd item:
        if len(raw_email) >= 2:
            b = raw_email[1]
            email = b.decode('utf-8').split('\r\n')
            subject_, from_, to_, body_, = ' '+email[0], email[1], email[2], 'Body: ' + email[4]
            print(subject_, '\n',
                  from_, '\n',
                  body_, '\n')
        else:
            print("This raw_email was not processed: " + repr(raw_email))
            # Use this for debugging if the raw_email is diff't length

This was tested in Python 2.7, where it appears to work.

NOTE: In Python 3.x you need to use b.decode('utf-8'), which isn't necessary in Python 2.

This should work as long as splitting each message always yields an email list of length 5.
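One way to loosen that fixed-length assumption is sketched here with a hypothetical helper, `split_message` (not part of imaplib): separate the header block from the body at the first blank line, then look headers up by name rather than by position. It still assumes simple, unfolded headers and a plain-text body.

```python
def split_message(raw_bytes, encoding='utf-8'):
    """Hypothetical helper: split a simple message into (headers dict, body text)."""
    text = raw_bytes.decode(encoding)
    # Headers and body are separated by the first blank line.
    header_block, _, body = text.partition('\r\n\r\n')
    headers = {}
    for line in header_block.split('\r\n'):
        name, sep, value = line.partition(':')
        if sep:  # ignore lines that are not of the form "Name: value"
            headers[name.strip().lower()] = value.strip()
    return headers, body


# Sample bytes abbreviated from the message quoted in the comments.
raw = (b'Subject: Want to go to the beach Sunday?\r\n'
       b'From: you@email.net\r\n'
       b'To: me@email.net\r\n'
       b'\r\n'
       b'I hope you can make it Sunday?')

headers, body = split_message(raw)
print(headers.get('subject', '(no subject)'))
print(headers.get('from', '(no sender)'))
print('Body:', body)
```

Using `dict.get` with a default means a message with a missing header prints a placeholder instead of raising an error.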

David Zemens
  • This is assuming a lot about the format of the message. Hopefully user can eventually move to using the MIME Parser in python. – Max Jul 13 '16 at 16:29
  • @Max assumptions based on sample data obtained via Chat. – David Zemens Jul 13 '16 at 16:44
  • @Max I looked into MIME Parser. I think it's deprecated in Python 3.5 now. Instead they want you to use email.parser. – Joansy Jul 13 '16 at 16:48
  • @DavidZemens This adjusted code prints the email pieces AND doesn't return an index error. This is what I am looking for. Thank you! – Joansy Jul 13 '16 at 17:16
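Following up on the parser suggestion in the comments, here is a minimal sketch using the standard library's `email.message_from_bytes`, again on hand-built sample bytes rather than a live fetch. It assumes a non-multipart, plain-text message; multipart mail would need its parts walked individually. Note that the question's code uses `email` as a variable name, which would shadow this module.

```python
import email

# Sample bytes abbreviated from the message quoted in the comments.
raw = (b'Subject: Want to go to the beach Sunday?\r\n'
       b'From: you@email.net\r\n'
       b'To: me@email.net\r\n'
       b'\r\n'
       b'I hope you can make it Sunday?')

# Parse the raw bytes into a Message object; headers are looked up by name.
msg = email.message_from_bytes(raw)

print('Subject:', msg['Subject'])
print('From:', msg['From'])

if not msg.is_multipart():
    # For a simple, non-multipart message the payload is the body text.
    print('Body:', msg.get_payload())
```

In the fetch loop, `raw` would be the second item of each tuple in `msg_data`.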