The email header decoder in Python 2.7 and Python 3 seems to behave strangely when a header switches between encoded and unencoded text.
from email.header import decode_header
# print(...) with a single argument runs on both Python 2.7 and Python 3
print(decode_header("=?ISO-8859-1?B?QA==?=example.com"))
print(decode_header("=?ISO-8859-1?B?QA==?= example.com"))
print(decode_header("=?ISO-8859-1?Q?=40example?= .com"))
print(decode_header("=?ISO-8859-1?Q?=40example?=.com"))
Here is the result:
[('=?ISO-8859-1?B?QA==?=example.com', None)]
[('@', 'iso-8859-1'), ('example.com', None)]
[('@example', 'iso-8859-1'), ('.com', None)]
[('=?ISO-8859-1?Q?=40example?=.com', None)]
In each example input the encoded text contains an @ sign ("@" in the first two, "@example" in the last two) and it should be decoded properly, but that only happens when whitespace follows the closing "?=". I think Python's interpretation of RFC 1342 is incorrect. Python expects a space or newline to mark the end of an encoded text. I don't see this in the RFC: as I read it, the RFC only says a space is needed between multiple encoded-texts, not between an encoded-text and the unencoded portions of the text. So whenever the decoder sees "?=" it should treat that as the end of the encoded text, which Python does not do. I want to ask the experts: is this a bug, or have I got this wrong?
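To make the behavior I expected concrete, here is a minimal sketch (Python 3) of a decoder that treats "?=" as the end of the encoded text wherever it appears. The name lenient_decode_header and the ENCODED_WORD pattern are my own, hypothetical, and not part of the email package; the sketch also ignores details such as dropping whitespace between adjacent encoded texts.

```python
import base64
import quopri
import re

# Hypothetical pattern: =?charset?encoding?text?= anywhere in the string,
# with no requirement that whitespace follows the closing "?=".
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def lenient_decode_header(value):
    """Decode every encoded text found in value and return a single str."""

    def replace(match):
        charset, encoding, text = match.groups()
        if encoding.upper() == "B":
            raw = base64.b64decode(text)
        else:
            # header=True makes quopri map "_" to a space, as Q encoding requires.
            raw = quopri.decodestring(text.encode("ascii"), header=True)
        return raw.decode(charset)

    # Text outside the matches is left untouched.
    return ENCODED_WORD.sub(replace, value)


print(lenient_decode_header("=?ISO-8859-1?B?QA==?=example.com"))
print(lenient_decode_header("=?ISO-8859-1?Q?=40example?=.com"))
```

With this sketch, both inputs that decode_header left untouched above come out as @example.com.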
Vijay