I host my own Linux mail server for my family. Yesterday my father lost all the emails in his Inbox folder. I'm still not sure whether it was a terrible user error or a compromised password, but that's not the point here. Thanks to Murphy's law, I also had no backup (don't shoot, I created one right afterwards), and I feel terrible for him. So my only remaining option is to try to recover the deleted emails from the partition.
I immediately took an image of the whole ext4 data partition on the server with "dd", and now I have an archive of several hundred GB to deal with, which feels like a giant haystack. What is the best way to extract the emails from this image? I know they are in there somewhere, because when I grep for my dad's address I get lots of matches like "To: dad@mydomain.com", and with the -C option I can see the other usual message headers (From, Subject, Date, Message-Id, ...).
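To make the grep step reusable, here is a minimal sketch of the same search in Python that reports byte offsets instead of printing context. The function name `find_offsets` and the chunk size are my own placeholders, nothing standard; it only assumes the image can be read as a plain binary file.

```python
from typing import BinaryIO, Iterator


def find_offsets(stream: BinaryIO, needle: bytes,
                 chunk_size: int = 1 << 20) -> Iterator[int]:
    """Yield the absolute byte offset of every occurrence of needle.

    The stream is read in chunks so a multi-hundred-GB image never has
    to fit in memory.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    tail = b""   # last bytes of the previous buffer, kept for overlap
    base = 0     # absolute offset of the first byte of tail + chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        pos = buf.find(needle)
        while pos != -1:
            yield base + pos
            pos = buf.find(needle, pos + 1)
        # Keep len(needle) - 1 bytes so a match split across two chunks
        # is still found, without ever reporting the same match twice.
        tail = buf[-overlap:] if overlap else b""
        base += len(buf) - len(tail)
```

Usage would be something like `with open("partition.img", "rb") as f: hits = list(find_offsets(f, b"To: dad@mydomain.com"))`, where `partition.img` stands for whatever the dd output file is called.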
I first tried "foremost" with a custom format, but since an email doesn't have a fixed size, the results were inconclusive.
I also tried https://pypi.org/project/mail-parser/ but it looks like it would need patching to do what I want (it expects a text file containing a single email, not a big raw file with many emails in it).
Do you know of any other (free) tool or method to reconstruct the email files from this ext4 image with reasonable accuracy? As explained, the tricky part is that, unlike images or other formats, emails are stored as plain text and don't directly contain their size, so I think the tool will have to be RFC 822 aware at some point to do the parsing/extraction.
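To illustrate what I mean by "RFC 822 aware", here is a rough sketch of the kind of carving I have in mind, using only Python's standard `email` package. Everything in it is an assumption on my side: the helper name `carve_message`, the 10 MB size cap, and above all the heuristic that a message starts after the nearest NUL byte or blank line before the matched header and ends at the next NUL byte. It also assumes the message was stored contiguously, which will not hold for fragmented files.

```python
import email
from typing import Optional

MAX_MESSAGE_SIZE = 10 * 1024 * 1024  # assumed upper bound, adjust as needed


def carve_message(buf: bytes, hit: int,
                  max_size: int = MAX_MESSAGE_SIZE) -> Optional[bytes]:
    """Return the candidate message around offset hit in buf, or None.

    hit is expected to point at a header line such as b"To: ...".
    """
    # A header block contains no blank line, so the message cannot start
    # earlier than the last NUL byte or blank line before the hit.
    start = 0
    for sep in (b"\x00", b"\n\n", b"\r\n\r\n"):
        pos = buf.rfind(sep, 0, hit)
        if pos != -1:
            start = max(start, pos + len(sep))

    # Heuristic end: the next NUL byte, capped at max_size bytes.
    end = buf.find(b"\x00", hit)
    if end == -1:
        end = len(buf)
    end = min(end, start + max_size)

    candidate = buf[start:end]
    # Let the standard parser decide whether this looks like a message:
    # keep it only if the usual headers were actually parsed as headers.
    msg = email.message_from_bytes(candidate)
    if "From" not in msg or "Date" not in msg:
        return None
    return candidate
```

Each offset found by grep would be passed in as `hit`, together with a window of the image read around it, and every non-None result written out as its own file. A match inside a quoted reply in a message body would make the start detection wrong, so the results would still need manual review.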