
google-code-prettify handles HTML escapes in code blocks by 'prettifying' the characters of the escape sequence themselves, rather than the character the sequence stands for. For example, the original HTML

<code class="prettyprint lang-sql"> ... &gt; ... </code>

gets prettified into:

<span class="pun">&amp;</span><span class="pln">gt</span><span class="pun">;</span>

which obviously renders incorrectly. I can't return unescaped HTML inside <code>, because the content does not come from a trusted source and could be used as an XSS vector.

My question is whether there is any way to coerce google-pretty-print into doing the right thing and treating the content of <code> as escaped HTML, not as raw text.

Remus Rusanu

1 Answer


My question is whether there is any way to coerce google-pretty-print into doing the right thing and treating the content of <code> as escaped HTML, not as raw text.

It should, and it does. The C example demonstrates this where

#include &lt;stdio.h&gt;

is prettified to

<span class="pln">
</span><span class="com">#include</span><span class="pln"> </span><span class="str">&lt;stdio.h&gt;</span><span class="pln">

and I pasted your example code into the test page and got

<span class="pln"> </span><span class="pun">...</span><span class="pln"> </span><span class="pun">&gt;</span><span class="pln"> </span><span class="pun">...</span><span class="pln"> </span>

I think your problem is probably a result of some other layer. I would look into any content management system, templates, or blog publishing software that sees your HTML before the browser.
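One way such a layer could produce the output in the question is by escaping the content a second time. This is only a guess at the cause, but it fits the symptom: if the markup reaching the browser contains &amp;gt; instead of &gt;, the text the prettifier sees is the literal characters "&", "gt", and ";". A minimal sketch, where escapeHtml is a hypothetical helper standing in for whatever the template or CMS does, and is not part of google-code-prettify:

```javascript
// Hypothetical helper: stands in for the escaping step of a template
// engine or CMS. It is NOT part of google-code-prettify.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped here
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const raw = "a > b";

// Escaped once: the browser decodes "&gt;" back to ">" in the text node,
// which is what the prettifier should be tokenizing.
const escapedOnce = escapeHtml(raw);

// Escaped twice (e.g. by the application and then again by a template):
// the browser decodes "&amp;gt;" to the literal text "&gt;".
const escapedTwice = escapeHtml(escapedOnce);

console.log(escapedOnce);  // a &gt; b
console.log(escapedTwice); // a &amp;gt; b
```

Viewing the page source (not the DOM inspector) and searching for &amp;gt; inside the <code> element is a quick way to check whether this is what is happening.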

Mike Samuel