I have a special mbox file in which each message has one or more attached messages, created by forwarding mail as an attachment. I have a Perl script that uses MIME::Tools and MIME::Parser to parse that mbox file. It can pipe the attached messages to another program (spamassassin) or save them as individual text files, which is what I'm doing. I believe these individual files are in RFC822 format (not positive). The text files do not start with a "From " separator line, so I can't simply cat them back together.

I need a way to reassemble these extracted files back into mbox (mboxcl2) format. Is there a tool or script I can use to do that?

I tried having my script write them into a single file as they were parsed, each preceded by a line like "From me\@myserver.com Fri Sep 1 15:18:53 2017\n". This is enough for viewing with mailx on the server, but Dovecot complains: dovecot: imap(me): Error: Syncing INBOX failed: Mailbox isn't a valid mbox file

So I apparently need to do more than just add the "From " separator.

melpomene
shorton
  • A quick search reveals https://wiki2.dovecot.org/MailboxFormat/mbox - it mentions separating messages via Content-Length headers (see "Escaping From"). Have you investigated that? – bytepusher Sep 02 '17 at 16:17
  • Yeah, that's part of mboxcl2, mentioned in the original Q. I'm looking for an already available tool to put these back together. – shorton Sep 02 '17 at 20:03

1 Answer

Originally I was writing "\n\nFrom me...\n" to ensure the required blank line in front of each "From " line. I think the blank line at the very start of the file is what Dovecot was unhappy with.

I rewrote it so that, as the original parsing script breaks out the attached messages, it writes the two extra lines below, one before and one after the line that writes each individual message. The file now no longer starts with a blank line.

print OUT "From me\@myserver.com  Fri Sep  1 15:18:53 2017\n";
$ent->bodyhandle->print(\*OUT);
print OUT "\n\n";

OUT is the filehandle for the resulting mbox file. Since the original messages already had a Content-Length header, Dovecot and Outlook at least are happy with the resulting format. So I'm good now, I think.
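
For anyone starting from files that are already extracted, the same layout can be sketched as a small standalone routine. This is only a minimal illustration under assumptions, not the script I actually ran: build_mbox and the sample messages are hypothetical, the separator line is passed in ready-made, and any existing Content-Length headers are left untouched rather than recomputed.

```perl
use strict;
use warnings;

# Hypothetical helper: build mbox text from already-extracted messages.
#   $messages  - array ref of raw message strings (headers + body)
#   $from_line - complete "From " separator line, without trailing newline
sub build_mbox {
    my ($messages, $from_line) = @_;
    my $mbox = '';
    for my $msg (@$messages) {
        # The separator is the first thing written for each message,
        # so the file never begins with a blank line.
        $mbox .= "$from_line\n";
        $mbox .= $msg;
        # Add a final newline only if the message lacks one.
        $mbox .= "\n" unless $msg =~ /\n\z/;
        # Blank line separating this message from the next "From " line.
        $mbox .= "\n";
    }
    return $mbox;
}

# Made-up sample messages; real ones would be read from the extracted files.
my @messages = (
    "Subject: one\nContent-Length: 6\n\nfirst\n",
    "Subject: two\nContent-Length: 7\n\nsecond\n",
);
my $mbox = build_mbox(\@messages,
    'From me@myserver.com  Fri Sep  1 15:18:53 2017');
print $mbox;
```

Because the blank line comes after each message instead of before it, the first line of the output is always a "From " line, which is the change that mattered in my case.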

shorton