I have spent the last two days working with WMD and Markdown, and I have not found a way to store the data securely. I would like users to be able to post HTML/XML <code> (with WMD) on my site.

For the moment, I store the data in Markdown format, but if JavaScript is disabled a user can easily inject XSS. If I apply strip_tags or htmlentities to all the data, I lose the user's HTML/XML <code>. How can I do it?

In my opinion, I should apply htmlentities only to the code between <pre> and </pre>, but how? My data is in Markdown.

Beyond that, what can I do to block XSS in attributes, such as:

<img src="javascript:alert('xss');" />
Peter Mortensen

1 Answer

To "clean" your HTML, you could use a tool like HTML Purifier.

Basically, it allows you to specify which tags and attributes are allowed, and only keeps those.

It also produces valid (X)HTML code as output -- which is nice.

You can see on the demo page there is an example that is almost exactly the XSS you posted, btw ;-)

For instance, you can try it with some HTML like this:

test <img src="javascript:evil();" onload="evil();" /> 
test <img src="http://www.google.com/a.Png" /> test2

The output is:

test  test <img src="http://www.google.com/a.Png" alt="a.Png" /> test2

The img tag containing the XSS has not been kept; the other one has, and an alt attribute has been added to make it standards-compliant.
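To make the allowlist idea concrete, here is a minimal sketch in Python using only the standard library. It is not HTML Purifier and does not reflect its API or implementation: the `ALLOWED` table, the `is_safe_url` helper and the `sanitize` function are all hypothetical names chosen for illustration. Unlike HTML Purifier, it does not add a missing alt attribute or repair invalid markup, so treat it as an illustration of the principle rather than something to deploy.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist: tag name -> attributes permitted on that tag.
ALLOWED = {
    "p": set(), "em": set(), "strong": set(), "code": set(), "pre": set(),
    "a": {"href"}, "img": {"src", "alt"},
}
URL_ATTRS = {"href", "src"}
SAFE_SCHEMES = {"http", "https"}
VOID_TAGS = {"img"}


def is_safe_url(value):
    """Accept only absolute http(s) URLs (an assumption of this sketch).

    javascript:, data: and relative URLs are all rejected.
    """
    try:
        return urlparse(value.strip()).scheme.lower() in SAFE_SCHEMES
    except ValueError:
        return False


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # tag is dropped; its text content is still emitted, escaped
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue  # e.g. onload="..." is not on the allowlist
            if name in URL_ATTRS and not is_safe_url(value):
                continue  # e.g. src="javascript:..."
            kept.append((name, value))
        if tag == "img" and "src" not in dict(kept):
            return  # an image without a usable source is dropped entirely
        attr_text = "".join(f' {n}="{escape(v, quote=True)}"' for n, v in kept)
        closing = " /" if tag in VOID_TAGS else ""
        self.out.append(f"<{tag}{attr_text}{closing}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


sample = ('test <img src="javascript:evil();" onload="evil();" /> \n'
          'test <img src="http://www.google.com/a.Png" /> test2')
print(sanitize(sample))
```

The design choice worth noting is that everything is rejected by default: a tag, attribute or URL scheme survives only if it is explicitly listed, which is the same stance HTML Purifier takes.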

It might not solve all your problems, but if you are giving users the possibility to input HTML, it is definitely useful (dare I say "it's a must-have"?)

Pascal MARTIN
  • Thanks for your reply. I know HTML Purifier and other "sanitizers". The real problem is: how can I avoid losing both the basic HTML layout (generated by WMD) and the HTML inserted by the user in .. –  Jul 23 '09 at 20:14
  • Basic layout: if it's just some simple tags, you can define those as "allowed" for HTMLPurifier, so they are not removed. If you want to allow HTML inserted between special tags, take a look at this answer: http://stackoverflow.com/questions/1155443/process-a-block-of-html-ignoring-content-within-specific-tags/1155530#1155530 ; by mixing that and HTMLPurifier, you might get what you want? – Pascal MARTIN Jul 23 '09 at 20:59
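The combination suggested in the last comment (treat the content of special tags differently from the rest) could look roughly like the following Python sketch. It assumes the raw, unescaped user code sits between plain `<pre>` or `<code>` tags without attributes. `protect_code_blocks` and `placeholder_sanitize` are hypothetical names, and the placeholder only stands in for a real sanitizer such as HTML Purifier; it is not safe on its own.

```python
import re
from html import escape

# Matches <pre>...</pre> or <code>...</code> without attributes (an assumption).
# With <pre><code>...</code></pre>, the outer <pre> matches and the inner
# <code> tags are escaped as text, which is a known limit of this sketch.
CODE_BLOCK = re.compile(r"<(pre|code)>(.*?)</\1>", re.DOTALL | re.IGNORECASE)


def protect_code_blocks(text, sanitize_rest):
    """Escape everything inside code blocks; pass the rest to sanitize_rest."""
    parts = []
    pos = 0
    for match in CODE_BLOCK.finditer(text):
        # Text before this code block goes through the real sanitizer.
        parts.append(sanitize_rest(text[pos:match.start()]))
        tag = match.group(1).lower()
        # Inside the block, markup is turned into harmless entities.
        inner = escape(match.group(2), quote=False)
        parts.append(f"<{tag}>{inner}</{tag}>")
        pos = match.end()
    parts.append(sanitize_rest(text[pos:]))
    return "".join(parts)


def placeholder_sanitize(fragment):
    """Stand-in for a real allowlist sanitizer: crudely strips every tag."""
    return re.sub(r"<[^>]*>", "", fragment)


sample = 'Look: <code><img src="javascript:alert(1)" /></code> <b onclick="x()">bold</b>'
print(protect_code_blocks(sample, placeholder_sanitize))
```

Because each fragment outside a code block is sanitized separately, a tag opened before a code block and closed after it is seen as two unbalanced halves; a real sanitizer would need to tolerate that, or the code blocks could be swapped for placeholders and restored after a single sanitizing pass.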