Is it possible to read emails based on the Subject line and then extract the base64 attachment, or get the attachment directly? Server: Linux system.

Dhiraj Tayade
  • Take a look at the `formail` program. This question is probably a better fit for the StackExchange sites [**Super User**](http://superuser.com/) or [**ServerFault**](http://serverfault.com/) – David C. Rankin Sep 19 '16 at 05:41
  • Is it possible to use only the command line, without additional utilities? – Dhiraj Tayade Sep 19 '16 at 05:42
  • Not that I know of, given the way messages and attachments are uuencoded and included as separate parts of the same file (although there are several formats). You need something that can extract and uudecode the attachments so you can read them (unless you read 7-bit ASCII). You can string a line of utilities together to do it in a one-liner, but it is better to just use the utility that was written to do just that. – David C. Rankin Sep 19 '16 at 05:44
  • Cross-site duplicate: http://superuser.com/questions/406125/utility-for-extracting-mime-attachments – tripleee Sep 19 '16 at 06:07
  • @DavidC.Rankin This clearly asks for MIME decoding, not prehistoric uuencode. – tripleee Sep 19 '16 at 06:31
  • My apologies, that is just my vernacular for mail encoding. I understood the current context. (I do note, your utility of choice was `formail` as well) – David C. Rankin Sep 19 '16 at 06:32
  • uuencode was supposed to be phased out when MIME was introduced. RFC2045 is from 1996 so there's been plenty of time to adapt, but some strongholds are still resisting. The `base64` encoding was introduced with MIME so its presence in the question is a strong hint, though not exactly proof, that the OP needs MIME support. – tripleee Sep 19 '16 at 06:55
  • `formail` is good for getting the Subject: header but doesn't know anything about MIME. You could get the Subject: header with a `sed` script, although it will need to be slightly more involved than the most naïve attempt (headers can be folded across multiple lines, etc). – tripleee Sep 19 '16 at 07:00
  • Possible with Mutt ? – Dhiraj Tayade Sep 19 '16 at 07:22
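
For the Subject: extraction without extra utilities mentioned in the comments above, here is a minimal sketch. It uses POSIX `awk` rather than the `sed` script suggested there, and the function name `get_subject` is made up for illustration. It unfolds continuation lines but, like `formail`, does nothing about RFC2047 encoding.

```shell
#!/bin/sh
# get_subject FILE: print the unfolded Subject: header of a single message.
# Hypothetical helper; assumes FILE holds one message with LF line endings.
get_subject() {
    awk '
        /^$/ { exit }                        # a blank line ends the headers
        /^[ \t]/ {                           # continuation of the previous header
            if (s) { sub(/^[ \t]+/, " "); val = val $0 }
            next
        }
        { s = 0 }                            # any other line starts a new header
        tolower($0) ~ /^subject:/ {
            s = 1
            val = $0
            sub(/^[^:]*:[ \t]*/, "", val)    # drop the "Subject:" label
        }
        END { if (val != "") print val }
    ' "$1"
}
```

Something like `case $(get_subject "$message") in ...` could then stand in for the `formail` call, under the same assumptions.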

1 Answer

Your question seems to presuppose that there is a single attachment and that it can be reliably extracted. In the general case, an email message can have an arbitrary number of attachments, and each could use one of several encodings.

But if we assume that you are dealing with a single sender which consistently uses a static message template where the first base64 attachment is always going to be the one you want, something like

case $(formail -zcxSubject: <"$message") in
    "Hello, here is your report for "*)
        awk 'BEGIN { h=1 }
            h { if ($0 ~ /^$/) h=0 ; next }  # skip message headers
            /^Content-Disposition: attachment/ { a=1 }  # found an attachment part
            a && !p && /^$/ { p=1; next }  # end of part headers: payload starts
            p && (/^$/ || /^--/) { exit }  # blank line or MIME boundary ends payload
            p' "$message" |
        base64 -d ;;
esac

This will extract the Subject: header and compare it to a glob pattern. I expect this is what you mean by "based on subject" -- if we find a matching subject header, examine this message, otherwise discard.

The crude Awk script attempts to isolate the base64 data and pass it to `base64 -d` for decoding. It makes a number of pesky assumptions about the message format, and will probably need additional tweaking. Briefly, we skip the message headers, look for a MIME header identifying an attachment, and print the body of that MIME part, skipping everything else in the message. If this header is missing, or identifies the wrong MIME part, you will get no results or (worse) incorrect results. Also, the `/^Content-Disposition:/` regex could theoretically match a line which is not a MIME header, though this seems highly unlikely (but might actually happen if you are looking at, e.g., a bounce message).
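
To make the flag logic concrete, here is a sketch that runs a variant of the Awk script on a made-up message. The function name `extract_first_attachment`, the sample message, and the boundary `XYZ` are all invented for illustration. This variant stops at the first blank line or MIME boundary after the payload, and it assumes LF line endings.

```shell
#!/bin/sh
# extract_first_attachment FILE: print the raw base64 body of the first
# MIME part marked "Content-Disposition: attachment" (hypothetical helper).
extract_first_attachment() {
    awk 'BEGIN { h = 1 }
        h { if ($0 ~ /^$/) h = 0; next }                          # skip message headers
        !a && /^Content-Disposition: attachment/ { a = 1; next }  # attachment part found
        a && !p { if ($0 ~ /^$/) p = 1; next }                    # skip remaining part headers
        p && (/^$/ || /^--/) { exit }                             # blank line or boundary ends payload
        p' "$1"
}

# Made-up sample message; the payload "aGVsbG8K" is base64 for "hello\n".
sample=$(mktemp) || exit 1
trap 'rm -f "$sample"' EXIT
cat >"$sample" <<'EOF'
From: reports@example.com
Subject: Hello, here is your report for Monday
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain

See attachment.

--XYZ
Content-Type: text/plain; name="report.txt"
Content-Disposition: attachment; filename="report.txt"
Content-Transfer-Encoding: base64

aGVsbG8K
--XYZ--
EOF

extract_first_attachment "$sample" | base64 -d    # prints: hello
```

The text part is ignored because `a` is still unset when its body goes by; only the lines between the attachment's part headers and the closing boundary reach `base64 -d`.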

A more robust approach would involve a MIME extraction tool or perhaps a custom script to actually parse the MIME structure and extract the part you want. Without details about what exactly you need, I'm not able to provide that. (This would also allow you to use the sender's specified filename; the above script simply prints the decoded payload to standard output.)

Note also that formail has no idea about RFC2047 encoding, so if the subject is not plain ASCII, you have to specify the encoded form in the script.
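
As an illustration of that last point, here is a sketch where the pattern is written in its encoded form. The subject text ("Résumé ...") and the choice of UTF-8 with "Q" encoding are assumptions; a real sender might use "B" encoding, a different charset, or different capitalization, so check the raw header of an actual message first.

```shell
#!/bin/sh
# subject_matches RAW_SUBJECT: succeed if the raw (still encoded) Subject
# starts with the encoded-word for "Résumé" in UTF-8 "Q" encoding.
# Hypothetical helper; the subject and encoding are made-up examples.
subject_matches() {
    case $1 in
        # Single-quoted so the "?" characters are literal, not glob wildcards.
        '=?UTF-8?Q?R=C3=A9sum=C3=A9'*) return 0 ;;
        *) return 1 ;;
    esac
}

subject_matches '=?UTF-8?Q?R=C3=A9sum=C3=A9_for_Monday?=' && echo "encoded form matches"
subject_matches 'Résumé for Monday' || echo "decoded text does not match"
```

Quoting matters here because an unquoted `?` in a `case` pattern matches any single character, which would make the comparison looser than intended.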

tripleee
  • Hi. "The crude Awk script attempts to isolate the base64 data and pass it to base64 -d for extraction": how would that be done? I am able to extract the sender, subject, and attachment name. I am interested in extracting the base64 data. – Dhiraj Tayade Sep 19 '16 at 07:24
  • I don't understand your question. Do you need more details about what the Awk script does, or do you want a script with more or different features? – tripleee Sep 19 '16 at 07:55
  • I want to understand how to read the /var/spool/mail/user file and then extract the details for the attachment based on the different subjects. – Dhiraj Tayade Sep 20 '16 at 06:46
  • You are still not revealing what "based on the different subjects" actually means. The spool file is an `mbox` file which contains multiple messages. You can run the above script on each of them with `formail -s path/to/script <"$MAIL"` – tripleee Sep 20 '16 at 06:51
  • ... The above script expects `message` to point to the file name of a message, but `formail` will feed each message as standard input. Maybe refactor the code above to save the incoming message to a temporary file, and remove it when done or interrupted (hint: [cascading traps](http://stackoverflow.com/a/14275357/874188)), add a shebang, save, mark as executable, etc. – tripleee Sep 20 '16 at 07:04
  • Maybe vaguely see also http://stackoverflow.com/questions/31915712/procmail-giving-no-match-on-content-type -- not an excellent question, but it could give you some ideas. – tripleee Sep 20 '16 at 07:07
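
The refactoring suggested in the comments above (save the incoming message to a temporary file, remove it when done or interrupted) could look roughly like this. The helper name `with_saved_stdin` is made up, and the `grep` call is only a stand-in for the real per-message processing.

```shell
#!/bin/sh
# with_saved_stdin CMD [ARGS...]: save standard input to a temporary file,
# run CMD ARGS... FILE, then remove the file, also on HUP/INT/TERM.
# Hypothetical helper; the body runs in a subshell so the traps stay local.
with_saved_stdin() (
    tmp=$(mktemp) || exit 1
    trap 'rm -f "$tmp"' EXIT     # clean up whenever the subshell exits
    trap 'exit 129' HUP          # signals cascade into the EXIT trap
    trap 'exit 130' INT
    trap 'exit 143' TERM
    cat >"$tmp"
    "$@" "$tmp"
    status=$?
    exit "$status"
)

# Stand-in demonstration: count Subject: lines in a message read from stdin.
printf 'Subject: test\n\nbody\n' | with_saved_stdin grep -c '^Subject:'    # prints 1
```

In the real script, the stand-in would be replaced by a function that takes the file name and runs the `case` statement from the answer; the script would then be invoked once per message with `formail -s path/to/script <"$MAIL"`.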