
Rails's built-in HTML sanitizer (which uses the gem Loofah) adds newlines between <ul> and <li> tags. I want to display the sanitized content with white-space: pre-wrap; because it comes from a WYSIWYG editor, but the extra newlines make the output look wrong. Desired on top, actual on bottom, with background color added to ul for emphasis:

[Image: desired versus actual output of HTML sanitizing]

Here's what happens when I run some input through the sanitizer in the Rails console:

2.2.2 :033 > input = "<ul><li>a</li><li>b</li></ul>"
 => "<ul><li>a</li><li>b</li></ul>"
2.2.2 :034 > WhiteListSanitizer.new.sanitize(input)
 => "<ul>\n<li>a</li>\n<li>b</li>\n</ul>"

If I build a Loofah fragment and convert it to HTML without scrubbing, it still adds the newlines:

2.2.2 :035 > Loofah.fragment(input).to_html
 => "<ul>\n<li>a</li>\n<li>b</li>\n</ul>"

How do I make it leave the whitespace alone?

I can strip out the line breaks with a regex if absolutely necessary, but it seems strange that there's no option to disable this behavior.
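
For reference, the regex fallback mentioned above could look like the sketch below. It assumes the only unwanted whitespace is a single newline sitting directly between two tags, as in the console output above. `strip_formatter_newlines` is a hypothetical helper name, not part of Rails or Loofah.

    # Hypothetical helper: removes a newline only when it sits directly
    # between a closing ">" and an opening "<".
    def strip_formatter_newlines(html)
      html.gsub(/>\n</, "><")
    end

    sanitized = "<ul>\n<li>a</li>\n<li>b</li>\n</ul>"
    puts strip_formatter_newlines(sanitized)
    # prints <ul><li>a</li><li>b</li></ul>

Note that this would also drop a newline the user deliberately typed between two tags, which matters under white-space: pre-wrap, so it is a blunt workaround rather than a real fix.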

Tom Smilack

3 Answers


You need to remove the FORMAT flag from the DEFAULT_HTML save options (both constants are defined by Nokogiri):

input = "<ul><li>a</li><li>b</li></ul>"
disable_formatting = Nokogiri::XML::Node::SaveOptions::DEFAULT_HTML ^ Nokogiri::XML::Node::SaveOptions::FORMAT
Loofah.fragment(input).to_html(save_with: disable_formatting)
# => "<ul><li>a</li><li>b</li></ul>"
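
To see why the XOR works, here is a minimal sketch of the bit arithmetic. The numeric values are stand-ins assumed to mirror Nokogiri::XML::Node::SaveOptions on MRI; they are written out as local variables so the snippet runs without the gem, and the real constants should be used in application code.

    # Stand-in flag values (assumed to match Nokogiri's SaveOptions on MRI).
    format_flag    = 1
    no_declaration = 2
    no_empty_tags  = 4
    as_html        = 64
    default_html   = format_flag | no_declaration | no_empty_tags | as_html

    # XOR toggles the bit: it clears FORMAT here only because FORMAT is set.
    via_xor = default_html ^ format_flag

    # AND-NOT clears the bit whether or not it was set.
    via_and_not = default_html & ~format_flag

    # Starting from options that already lack FORMAT, the two differ.
    already_unformatted = no_declaration | as_html
    toggled_back_on = already_unformatted ^ format_flag   # FORMAT is set again
    still_cleared   = already_unformatted & ~format_flag  # FORMAT stays off

    puts via_xor == via_and_not  # prints true

If the base options might not include FORMAT (for example, a different default on another platform), `& ~FORMAT` is the safer way to write the same intent.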
Alexey Shein

It looks like Loofah uses Nokogiri behind the scenes, and Nokogiri formats (indents) its HTML output by default, hence the newline characters. I found a similar question about this Nokogiri behavior, and I believe the code below will work in your case:

Loofah.fragment(input).to_html(
  :save_with => Nokogiri::XML::Node::SaveOptions::AS_XML |
                Nokogiri::XML::Node::SaveOptions::NO_DECLARATION
).strip
Joel Brewer

Instead of calling Rails's built-in HTML sanitizer, which inserts the newlines:

ActionController::Base.helpers.sanitize('<script/><ul><li>111</li><li>222</li></ul>', tags: %w(p br ul ol li a), attributes: %w(href))
#=> "<ul>\n" + "<li>111</li>\n" + "<li>222</li>\n" + "</ul>"

Loofah can be used directly with the same scrubber rules, saving the result without the FORMAT flag:

loofah_fragment = Loofah.fragment('<script/><ul><li>111</li><li>222</li></ul>')
scrubber = Rails::Html::PermitScrubber.new
scrubber.tags = %w(p br ul ol li a)
scrubber.attributes = %w(href)
unformatted_html = Nokogiri::XML::Node::SaveOptions.class_eval { |m| m::DEFAULT_HTML ^ m::FORMAT }
loofah_fragment.scrub!(scrubber).to_html(save_with: unformatted_html)
#=> "<ul><li>111</li><li>222</li></ul>"
Lev Lukomsky