How can I save an e-mail message to a file without loading it into memory? I use:

import poplib
pop_conn = poplib.POP3(servername)
pop_conn.user(username)
pop_conn.pass_(passwd)
msg = pop_conn.retr(msg_number)[1]
msg_text = '\n'.join(msg)
msg_file = open(msg_file_name, "wb")
msg_file.write(msg_text)
msg_file.close()

But this loads the whole message into memory.

pradyunsg
agrynchuk
  • You can't? All operations in Python are basically X -> memory buffer -> disk. – Torxed May 11 '13 at 15:06
  • I think I need to look in the direction of socket programming, but I don't know exactly how to do it. – agrynchuk May 11 '13 at 17:36
  • Again, `socket` -> `memory buffer` -> `disk`; same thing there. From an assembly perspective, everything is CPU and memory calculations before anything else. What it basically comes down to is shifting around memory allocations bit by bit and telling the CPU to fetch stuff from memory down to other parts of the motherboard (for instance, the disk). You cannot get around the memory. I'm sorry. – Torxed May 11 '13 at 19:22
  • @Tor, have you ever watched streaming video? How do you think it works? – alexis May 12 '13 at 09:34
  • @alexis it loads from the socket into memory. The difference is that it's received by the onboard CPU on the NIC, sent to the CPU for processing, redirected (if not stored for a brief moment in RAM) to the graphics memory buffer and then rendered for your convenience. It's still stored in memory. – Torxed May 12 '13 at 14:38
  • In the context of the question, "loading into memory" clearly means "loading into memory in its entirety"-- which can cause performance problems with very large emails. So no, it's not at all the same thing. The difference should be obvious from my answer. – alexis May 12 '13 at 14:52

1 Answer

The Python docs caution against using the POP3 protocol when IMAP is available. Your mail server probably understands IMAP, so you can use IMAP4.partial() to fetch the message in parts, writing each part to disk immediately.
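As a rough illustration of the fetch-in-parts idea, here is a minimal sketch. Note that it does not call IMAP4.partial(); it swaps in IMAP4.fetch() with a partial body section (`BODY.PEEK[]<offset.size>`), which asks the server for a byte range of the raw message. The helper name `save_imap_message`, the `chunk_size` default, and the assumption that `conn` is already logged in with a mailbox selected are all mine, and servers may differ in how they answer a request past the end of the message, so treat this as a starting point rather than a tested recipe.

```python
import imaplib


def save_imap_message(conn, msg_num, path, chunk_size=64 * 1024):
    """Hypothetical helper: write one message to 'path' in chunks.

    'conn' is assumed to be a logged-in imaplib.IMAP4 (or IMAP4_SSL)
    object with a mailbox already selected.
    """
    offset = 0
    with open(path, 'wb') as out:
        while True:
            # BODY.PEEK[]<offset.size> requests a byte range of the raw
            # message without setting the \Seen flag.
            parts = '(BODY.PEEK[]<%d.%d>)' % (offset, chunk_size)
            typ, data = conn.fetch(str(msg_num), parts)
            if typ != 'OK':
                raise imaplib.IMAP4.error('FETCH failed: %r' % (data,))
            # Literal payloads arrive as (header, payload) tuples;
            # anything else in the list is protocol framing.
            chunk = b''.join(item[1] for item in data
                             if isinstance(item, tuple))
            if not chunk:
                break  # nothing left to read
            out.write(chunk)
            if len(chunk) < chunk_size:
                break  # short read: end of message
            offset += len(chunk)
```

Only one chunk is held in memory at a time, so peak usage is bounded by `chunk_size` rather than by the size of the message.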

But if you have to use POP3, you're in luck: the POP3 protocol is line-oriented. Python's poplib library is pure Python, and it's a trivial matter to add an iterator by looking at the source. I didn't bother to derive from the POP3 class, so here's how to do it by monkey-patching:

from poplib import POP3

def iretr(self, which):
    """
    Retrieve whole message number 'which', in iterator form.
    Return content in the form (line, octets)
    """    
    self._putcmd('RETR %s' % which)
    resp = self._getresp()  # Will raise exception on error

    # Simplified from _getlongresp()
    line, o = self._getline()
    while line != '.':
        if line[:2] == '..':
            o = o-1
            line = line[1:]
        yield line, o
        line, o = self._getline()

POP3.iretr = iretr

You can then fetch your message and write to disk one line at a time, like this:

pop_conn = POP3(servername)
...
msg_file = open(msg_file_name, "wb")
for line, octets in pop_conn.iretr(msg_number):
    msg_file.write(line+"\n")
msg_file.close()
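
The code above was written for Python 2, where poplib hands back `str` lines. On Python 3 the same private helpers return `bytes`, so the comparisons need `b'.'` and `b'..'`. Below is a sketch of that adaptation, assuming `_putcmd`, `_getresp` and `_getline` keep behaving as they do in the current CPython source; they are private and undocumented, so this could break in a future release. `save_message` is a hypothetical wrapper added only for illustration.

```python
from poplib import POP3


def iretr(self, which):
    """Yield (line, octets) for message 'which', one line at a time."""
    # NOTE: _putcmd, _getresp and _getline are private poplib helpers.
    self._putcmd('RETR %s' % which)
    self._getresp()  # raises poplib.error_proto on an -ERR reply

    line, octets = self._getline()
    while line != b'.':             # a lone dot ends the message
        if line.startswith(b'..'):  # undo POP3 dot-stuffing
            octets -= 1
            line = line[1:]
        yield line, octets
        line, octets = self._getline()


POP3.iretr = iretr  # monkey-patch, as in the original code


def save_message(pop_conn, msg_number, path):
    """Hypothetical helper: stream one message to 'path' line by line."""
    with open(path, 'wb') as msg_file:
        for line, _octets in pop_conn.iretr(msg_number):
            msg_file.write(line + b'\n')
```

After logging in as in the question, `save_message(pop_conn, msg_number, msg_file_name)` writes the message without ever building the full list of lines. The `with` block also closes the file if the connection fails partway through.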
alexis