Render span-level string using Kramdown

Question

I know that I can parse a Markdown document and render it as HTML with Kramdown in Ruby using something like

require 'kramdown'

s = 'This is a _document_'
Kramdown::Document.new(s).to_html
# => "<p>This is a <em>document</em></p>\n"

In this case, the string s may contain a full document in markdown syntax.

What I want to do, however, is to parse s assuming that it only contains span-level markdown syntax, and obtain the rendered html. In particular there should be no <p>, <blockquote>, or, e.g., <table> in the rendered html.

s = 'This is **only** a span-level string'
# .. ??? ...
# 'This is <strong>only</strong> a span-level string'

How can I do this?

So you want to strip out all block-level elements? This is the default behavior of kramdown. See http://kramdown.gettalong.org/options.html — Mark Thomas, Aug 05 '14 at 14:04
That's also what I read, but the output still contains the `p`'s. Haven't figured out how to get kramdown to actually remove those. — Juan A. Navarro, Aug 05 '14 at 14:14
It appears that option is for parsing raw HTML; it doesn't have an effect on the output. The output is not changeable, as they aim to be consistent with other Markdown implementations. You'll probably have to post-process. — Mark Thomas, Aug 05 '14 at 15:03
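For readers who want to see what the post-processing suggested in the comment above could look like, here is a minimal sketch in plain Ruby. The helper name strip_outer_paragraph is hypothetical (it is not part of kramdown), and it assumes the rendered HTML is a single paragraph wrapped in a bare <p>…</p>, as in the example output from the question. Anything else is returned unchanged apart from surrounding whitespace.

```ruby
# Hypothetical helper (not part of kramdown): unwrap a single outer <p>...</p>.
# Assumes `html` is the rendering of one paragraph of span-level text, such as
# the string returned by Kramdown::Document#to_html for that kind of input.
def strip_outer_paragraph(html)
  stripped = html.strip
  match = stripped.match(%r{\A<p>(.*)</p>\z}m)
  # Return the (whitespace-stripped) input as-is when there is no single
  # wrapping paragraph, or when the body contains another paragraph start.
  return stripped if match.nil? || match[1].include?('<p>')
  match[1]
end

# Sample input written by hand in the shape kramdown produces.
rendered = "<p>This is <strong>only</strong> a span-level string</p>\n"
puts strip_outer_paragraph(rendered)
```

This only removes the wrapper. It does not stop other block-level elements (lists, blockquotes, tables) from being generated if the input happens to contain their syntax.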

Mark Thomas · Answer 1 · 2014-08-05T18:25:59.913

2

I would post-process the output with the sanitize gem.

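require 'kramdown'
require 'sanitize'

html = Kramdown::Document.new(s).to_html
# kramdown renders _x_ as <em> and **x** as <strong>, so allow both
output = Sanitize.fragment(html, elements: ['b', 'i', 'em', 'strong'])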

The elements option is a whitelist of allowed tags; just add all the tags you want to keep. The gem has a set of predefined whitelists, but none match exactly what you're looking for. (BTW, if you want a list of all the HTML5 elements allowed in a span, see the WHATWG's list of "phrasing content".)

I know this wasn't tagged rails, but for the benefit of readers using Rails: use the built-in sanitize helper.

edited Aug 05 '14 at 18:25

answered Aug 05 '14 at 17:16

Mark Thomas



I would rather *not add* the additional markup than have it removed. But if there is no other simple solution, I might just do this. – Juan A. Navarro Aug 06 '14 at 07:57
For security purposes, whitelists are preferred over blacklists. This is particularly a concern if the content is end-user created and the application generates public pages. – Mark Thomas Aug 06 '14 at 14:04
Sure, I always keep that in mind. But, in my case, the content is created by myself, not an end-user. Sanitation (somewhat) does what I want as a side effect, but my end goal here is not sanitation. – Juan A. Navarro Aug 06 '14 at 14:09

score 1 · Answer 2 · answered May 26 '15 at 20:18

You can create a custom parser, and empty its internal list of block-level parsers.

class Kramdown::Parser::SpanKramdown < Kramdown::Parser::Kramdown
  def initialize(source, options)
    super
    @block_parsers = []
  end
end

Then you can use it like this:

text = Kramdown::Document.new(text, :input => 'SpanKramdown').to_html

This should do what you want "the right way".
