I'm trying to implement a kind of spam report for my mail server. Sieve sorts all spam into one folder called Spam, and I loop over that folder with bash. This is how I extract the necessary information from each mail:

grep '^From' "$f" | head -n1 >> "$TMPFILE"
grep '^Subject' "$f" | head -n1 >> "$TMPFILE"

However, in some mails the subject is encoded like this:

Subject: =?ISO-8859-1?Q?Test:_Jaguar_XKR-S:_Unter_dem_Blech_lauert_d?=

How can I get the subject decoded correctly? I tried mail, mailx, and mutt, but none of them could simply load a mail from a file.

Max

1 Answer

The encoding in the Subject line looks like MIME encoded-words (RFC 2047). One way to decode it is a Perl script that uses the MIME::Words module. You can wrap the Perl call in a shell script and call that from your bash script.

convert_subject.sh:

 #!/bin/sh
 /usr/bin/perl -pe 'use MIME::Words(decode_mimewords); $_=decode_mimewords($_);'

Example of using the script:

$ echo "=?ISO-8859-1?Q?Test:_Jaguar_XKR-S:_Unter_dem_Blech_lauert_d?=" | sh convert_subject.sh

Which outputs:

Test: Jaguar XKR-S: Unter dem Blech lauert d
j.w.r
  • Works nicely! One small thing: the German umlauts are not converted correctly. Any ideas? :) _Subject: Nur fÃŒr kurze Zeit: 15 Euro Gutschein fÃŒr_ – Max Aug 20 '12 at 15:59
  • Working fine. The umlaut failure is caused by my formatting the mail as HTML, not by your little script. **THANKS** – Max Aug 20 '12 at 16:03
  • Actually, it looks like you have UTF-8 but are viewing it in an ISO-8859-1 terminal (or another legacy 8-bit Western encoding, such as Windows code page 1252). – tripleee Aug 21 '12 at 03:07
  • I added `"Content-Type: text/html; charset='utf-8'"` to the mail script, and now most umlauts are fine. Only a few here and there show up as a question mark in a box. Any suggestions? My code: `( echo "Subject: Spam" echo "MIME-Version: 1.0" echo "Content-Type: text/html; charset='utf-8'" echo "Content-Disposition: inline" echo "
    "
      cat $TMPFILE
      echo "
    " ) | sendmail $i`
    – Max Aug 21 '12 at 08:23
  • I figured out that lines starting with `=?utf-8?Q?` work, while lines with `=?iso-8859-1?Q?` get ? instead of umlauts. – Max Aug 21 '12 at 12:35
  • Working: `=?utf-8?Q?Top-_Reiseschn=C3=A4ppchen_+_Urlaubsgeld?=`. Not working: `=?iso-8859-1?Q?K=F6rperliche_N=E4he?=`. Executed from bash (echo "?..." | sh convert...) – Max Aug 21 '12 at 12:40