1

Consider this PHP code:

$content = '<img />'."\n\n".'<img />';
$doc = new DomDocument();
$doc->loadHTML($content);
echo $doc->saveHTML();

The output (apart from the wrapping html/body tags etc.) is:

<img><img>

with no space between the images.

Calling:

$doc->preserveWhiteSpace = true;

doesn't change anything.

How do I preserve the white space in the original HTML?

Stephen

2 Answers

2

Answering my own question - it's buggy behaviour of an old version of libxml2:

https://bugs.php.net/bug.php?id=50278

This issue is solved by passing LIBXML_HTML_NODEFDTD as option when loading the document. This constant is available as of PHP 5.4.0 when libxml2 >= 2.7.8 is used. See http://3v4l.org/qs4TC.
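A minimal sketch of that fix, reusing the snippet from the question. The `defined()` guard is my own addition (an assumption, not part of the original fix) so the script still runs where the constant is unavailable; in that case the old behaviour is to be expected.

```php
<?php
// Same input as in the question: two images separated by two newlines.
$content = '<img />' . "\n\n" . '<img />';

$doc = new DOMDocument();
$doc->preserveWhiteSpace = true;

// LIBXML_HTML_NODEFDTD requires PHP >= 5.4.0 built against libxml2 >= 2.7.8.
// Fall back to no options (0) when the constant is not defined.
$options = defined('LIBXML_HTML_NODEFDTD') ? LIBXML_HTML_NODEFDTD : 0;
$doc->loadHTML($content, $options);

$html = $doc->saveHTML();
echo $html;
```

With the option in effect, no default doctype is added, and the whitespace between the two `<img>` tags should survive in the output.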

The shared server I'm on uses 2.7.6 so not sure it helps me, but I can see if they can upgrade. Hope this helps someone else.
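To find out whether a given host can use the option, something like the following sketch may help. It only reports what PHP itself exposes through the `LIBXML_DOTTED_VERSION` constant; the variable names are just illustrative.

```php
<?php
// LIBXML_DOTTED_VERSION is the libxml version string reported by PHP, e.g. "2.7.6".
$hasOption = defined('LIBXML_HTML_NODEFDTD');
$libxmlOk  = version_compare(LIBXML_DOTTED_VERSION, '2.7.8', '>=');

echo 'libxml2 version: ', LIBXML_DOTTED_VERSION, "\n";
echo 'LIBXML_HTML_NODEFDTD defined: ', $hasOption ? 'yes' : 'no', "\n";
echo 'libxml2 >= 2.7.8: ', $libxmlOk ? 'yes' : 'no', "\n";
```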

Stephen
  • All sorts of whitespace-only text nodes are removed when this bug is present, despite `DOMDocument::preserveWhiteSpace` being `true`. This solution fixes it (I have libxml 2.9.4 and PHP 7.1.9, Windows x64). – Jake Jul 26 '18 at 16:20
-1

White spaces are collapsed in HTML.

You can learn about how to add white spaces using tutorials such as this one:

http://www.wikihow.com/Insert-Spaces-in-HTML

but in your case, instead of using "\n\n" use "<br><br>" and you will get your "line break" characters.

All HTML Entities (that are explained in that tutorial linked above) can be found in this reference page:

https://dev.w3.org/html5/html-author/charref

Please note: You should really be using CSS for the visual aspects of your page, so if you just want spacing between your images, use things like margins rather than HTML tags.

https://developer.mozilla.org/en/docs/Web/CSS/margin

Ruben Funai
  • I'm aware how HTML works. This is to do with how DomDocument works. If you use, for example, `'test'."\n\n".'test'`, DomDocument preserves the line breaks. – Stephen Apr 19 '16 at 05:02
  • Right, so now I understand what you are saying. DomDocument interprets a string of HTML and creates node objects, so all the formatting of your HTML string is ignored when you feed it into DomDocument. The `preserveWhiteSpace` property is used for text nodes, to avoid the collapse of ` apple ` into `apple`, but it is not designed to keep your decorative newline characters. Please note that there are circumstances in which `preserveWhiteSpace` is ignored: http://stackoverflow.com/questions/9972112/using-phps-domdocumentpreservewhitespace-false-and-still-getting-whitespace – Ruben Funai Apr 19 '16 at 05:32
  • Just in case your goal is to make the code more appealing, you may be interested in the formatOutput property: http://php.net/manual/en/class.domdocument.php#domdocument.props.formatoutput – Ruben Funai Apr 19 '16 at 05:36
  • This is not just about decoration. Image tags with no whitespace between them will display as touching visually. Image tags with at least one whitespace character will display with space between them. – Stephen Apr 19 '16 at 05:49
  • Also, preserve whitespace preserves non-text nodes as well. For example, try: `$content = '
    '."\n\n".'
    ';` Further testing suggests it just seems to be buggy with self-closing tags.
    – Stephen Apr 19 '16 at 05:49