To keep Postfix from breaking very long lines at smtp_line_length_limit (usually 998 characters), I am currently using PHP's tidy extension to wrap long lines in HTML emails (see related question):

$oTidy = new tidy();
$message = $oTidy->repairString($message,
    array("show-errors" => 0, "show-warnings" => false, "force-output" => true, 
    "alt-text" => "Please display images", "bare" => true, "doctype" => "auto", 
    "drop-empty-paras" => false, "fix-bad-comments" => false, "fix-uri" => true, 
    "join-styles" => false, "merge-divs" => true, "merge-spans" => true, 
    "preserve-entities" => true, "wrap" => 68),
    "utf8"
); 

Tidy is really good at wrapping long lines while keeping the HTML and CSS valid.
Unfortunately, it also does a lot more, such as trying to fix invalid HTML markup and changing HTML tags, doctypes, etc.

However, I only need the line wrapping. Everything else Tidy does is overhead and sometimes more annoying than helpful.

Now I have tried to use PHPMailer's wrapText() function. Unfortunately I have found a bug that makes it useless for me.
PHPMailer converts this source code

<html>
    <body>
        Loremipsumdolorsitametconsetetursadipscing<span style="font-family:'Courier New',sans-serif">lorem</span>
    </body>
</html>

to

<html>
    <body>
        Loremipsumdolorsitametconsete<span style="font-family:'Courier
        New',sans-serif">lorem</span>
    </body>
</html>

breaking the font formatting (Courier New) for the word lorem in some clients.
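
The same kind of breakage can be reproduced with PHP's built-in wordwrap(), which breaks at any space regardless of HTML context. This is only a sketch of the failure mode, not PHPMailer's actual wrapText() implementation; the width of 76 and the variable names are assumptions chosen for illustration.

```php
<?php
// One long line with a space inside a quoted CSS value ('Courier New').
$line = 'Loremipsumdolorsitametconsetetursadipscing'
      . '<span style="font-family:\'Courier New\',sans-serif">lorem</span>';

// wordwrap() only knows about spaces, not about tags or attribute values.
$wrapped = wordwrap($line, 76, "\n", false);

// True when the line break landed inside the quoted font name.
$brokeFontName = strpos($wrapped, "'Courier\nNew'") !== false;

echo $wrapped, "\n";
var_dump($brokeFontName);
```

Any wrapper that treats every space as a legal break point runs into this, which is why a plain word wrap is not safe for HTML with inline CSS.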

Now my questions:

How can I wrap HTML lines safely without damaging HTML and CSS?

How does Tidy do it? Should I use a DOM parser? Is there a PHP port of the Tidy source code (I haven't found one)?

Horen
  • wordwrap()? As long as you don't allow word breaks, the HTML/CSS should come through just fine. Unless, of course, you have some ridiculously long CSS class names. – Marc B Jul 31 '13 at 15:02

2 Answers

  1. Encode your text in base64 using base64_encode().
  2. Set the appropriate MIME header (Content-Transfer-Encoding: base64).
  3. Split the base64-encoded blob into 76-character-wide chunks using chunk_split().
  4. Profit!
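
A minimal sketch of these steps, assuming the message is sent with PHP's mail(). The HTML body, recipient address, subject, and variable names are placeholders, and the mail() call is left commented out so the snippet has no side effects.

```php
<?php
// Placeholder HTML body containing one very long line (assumption for illustration).
$html = '<html><body>' . str_repeat('Loremipsum ', 200) . '</body></html>';

// Steps 1 and 3: encode the body, then split it into 76-character lines ending in CRLF.
$body = chunk_split(base64_encode($html), 76, "\r\n");

// Step 2: tell the receiving client how the body is encoded.
$headers = implode("\r\n", [
    'MIME-Version: 1.0',
    'Content-Type: text/html; charset=UTF-8',
    'Content-Transfer-Encoding: base64',
]);

// mail('recipient@example.com', 'Subject', $body, $headers);

// Longest line in the encoded body, not counting the CRLF.
$longest = max(array_map('strlen', explode("\r\n", $body)));
echo $longest, "\n";
```

Because the encoded body contains only base64 characters and line breaks, no HTML or CSS token can be split in a way the client would notice: the original markup is restored byte for byte on decoding.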
Andrew Theken
Your Common Sense
  • How much longer will my source code be after encoding it in base64? – Horen Jul 31 '13 at 15:10
  • 3. split this base64-ed blob into 80 character wide chunks – Your Common Sense Jul 31 '13 at 15:16
  • Sizewise I mean. It will be base64 encoded so the total size of the file will be bigger, right? I understand that each line will only be 80 characters long. – Horen Jul 31 '13 at 15:18
  • Right. Though, as this problem doesn't bother a single email service or client software (every one of them encodes its emails in base64), I see no reason for this question at all – Your Common Sense Jul 31 '13 at 15:20
  • Base64 encoding is not used all the time, mainly for spam reasons: the receiving MTA cannot content-filter right away. `quoted-printable` might be a solution though. I'll look into it. – Horen Jul 31 '13 at 15:32
  • Errr.... spam? Do you really believe that "The receiving MTA cannot content filter [spam] right away"? Anyway, you can use quoted printable as well – Your Common Sense Jul 31 '13 at 15:34

The best way to go seems to be quoted-printable encoding, as it breaks lines into short chunks while keeping the content readable for content filters, without the risk of destroying any formatting.
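
A minimal sketch of the quoted-printable approach using PHP's built-in quoted_printable_encode(). The HTML body and variable names are placeholders chosen for illustration, and actually sending the mail is left out.

```php
<?php
// Placeholder HTML with one long line and a non-ASCII character (assumption).
$html = '<p style="font-family:\'Courier New\',sans-serif">'
      . str_repeat('Lorem ipsum dolor sit amet ', 60)
      . "caf\xC3\xA9</p>"; // "é" written as UTF-8 bytes

// quoted_printable_encode() inserts soft line breaks ("=" followed by CRLF),
// which the receiving client removes again when decoding.
$body = quoted_printable_encode($html);

// Header that tells the client how to decode the body.
$headers = implode("\r\n", [
    'MIME-Version: 1.0',
    'Content-Type: text/html; charset=UTF-8',
    'Content-Transfer-Encoding: quoted-printable',
]);

// Longest encoded line (without CRLF) and a round-trip check.
$longest   = max(array_map('strlen', explode("\r\n", $body)));
$roundTrip = quoted_printable_decode($body) === $html;

echo $longest, "\n";
var_dump($roundTrip);
```

The soft line breaks are part of the transfer encoding rather than of the content, so a break that falls inside 'Courier New' disappears again before the HTML is rendered.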

Base64 would also be an option but would raise the risk of spam classification.

Both options increase the size of the source, however: base64 by roughly a third, quoted-printable mostly for non-ASCII characters.
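
To get a feel for the overhead, the encoded sizes can be measured directly. This is a rough sketch: the sample strings and the overhead() helper are hypothetical, and real messages will give different ratios.

```php
<?php
// Two placeholder bodies of equal size: plain ASCII vs. mostly non-ASCII (assumptions).
$ascii    = str_repeat('Lorem ipsum dolor sit amet. ', 100);
$nonAscii = str_repeat("\xC3\xA4\xC3\xB6\xC3\xBC ", 400); // "äöü " as UTF-8 bytes

// Hypothetical helper: encoded size relative to the original size.
function overhead(string $original, string $encoded): float
{
    return strlen($encoded) / strlen($original);
}

foreach (['ascii' => $ascii, 'non-ascii' => $nonAscii] as $label => $text) {
    printf(
        "%-10s base64: %.2f  quoted-printable: %.2f\n",
        $label,
        overhead($text, chunk_split(base64_encode($text))),
        overhead($text, quoted_printable_encode($text))
    );
}
```

Base64 grows by a fixed ratio whatever the content, while quoted-printable stays close to the original size for ASCII text and grows with the share of bytes that have to be escaped.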

Side note:
PHPMailer's wrapText() will not be fixed, since the problem described can be solved by choosing a suitable mail encoding, as outlined above.

Horen