I am creating a website where users "post": the form content is saved in a MySQL database and retrieved when the page loads, similar to Facebook. I construct all the posts and insert raw HTML into a template. While testing, I noticed that I could write JavaScript or other HTML into the form and submit it, and upon reloading, that HTML or JS would be treated as source code rather than as a post. I figured that some simple encoding would do the trick, but using <form accept-charset="utf-8"> is not working. Is there an efficient way to prevent this type of security hole?

Ned Batchelder
Wiz
  • Yes. Don't save HTML in the database. – JJJ Jul 07 '12 at 21:11
  • Those types of attacks are called XSS (cross-site scripting). – dAm2K Jul 07 '12 at 21:13
  • @Juhana I'm not saving the HTML, I'm saving what I get from the form post, which is usually just some text. But I was thinking about an attacker writing JavaScript into the form and submitting it, so that the generated HTML would include the JS. – Wiz Jul 07 '12 at 21:15
  • In other words you're saving HTML in the database. – JJJ Jul 07 '12 at 21:16
  • Do you WANT your users to be able to use HTML in your posts? If you do, then you have a problem. You'll have to find a secure way to ensure that they use the tags you want them to use and don't use the tags you don't, and that is complex and not very safe. If you don't, then strip HTML from the posts altogether. – Andrew Gorcester Jul 07 '12 at 21:18

5 Answers

For completeness, I'd like to mention that there are two places where you can sanitize user input in Pyramid: on the way in, before saving data in the database, and on the way out, before rendering the data in the template. Arguably, there's nothing wrong with storing HTML/JavaScript in the database; it won't bite you as long as you ensure that everything rendered in your template is properly escaped.

In fact, both the Chameleon and Mako templating engines have HTML escaping turned on by default, so if you just use them "as usual", user-entered HTML will never be injected into your page; it will be rendered as text instead. Without this, sanitizing user input would be a daunting task, as you'd need to check every single field in every form a user enters data into (i.e. not only the "convenient" textarea widgets, but everything else too: user name, user email, etc.).

So you must be doing something unusual (or using some other template library) to make Pyramid behave this way. If you provide more details on the templating library you're using and a code sample, we'll be able to find a proper way to fix it.
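To illustrate escaping "on the way out" when the HTML is assembled by hand instead of by a template engine, here is a minimal sketch using only the standard library's `html.escape`. The `POST_TEMPLATE` string and the `render_post` helper are hypothetical names made up for this example; they are not part of Pyramid, Chameleon, or Mako.

```python
from html import escape

# Hypothetical hand-built template; a real app would normally let the
# template engine do the escaping instead.
POST_TEMPLATE = '<div class="post"><span class="author">{author}</span><p>{body}</p></div>'


def render_post(author: str, body: str) -> str:
    """Build the HTML for one post, escaping every user-supplied value."""
    # escape() turns &, < and > into entities and, by default, the two
    # quote characters as well, so the browser shows them as plain text.
    return POST_TEMPLATE.format(author=escape(author), body=escape(body))


# The post is stored exactly as the user typed it...
stored_body = '<script>alert("xss")</script>'
# ...and only escaped at the moment it is rendered.
html_out = render_post("Wiz", stored_body)
print(html_out)
```

The stored value stays untouched in the database; only the rendered copy is escaped, which is the same division of labour the template engines apply automatically.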

Sergey
  • Have to say that this is the cleverest answer here. Filtering output is one reason to use a template engine. No need to worry unless you disable filtering. Save everything as is, but escape everything, and if needed, use custom filters (like a filter that allows only span with the class attribute). – Loïc Faure-Lacroix Jul 08 '12 at 14:46

The type of attack you are describing is called a "JavaScript injection attack" or a "cross-site scripting (XSS) attack". You may have more luck searching for those terms. "Sanitising user input using Python" is a similar question that includes some pretty comprehensive answers, although the best is probably here.

Pete Baughman
  • Not. It's called XSS (cross-site scripting). SQL Injection is another thing. – dAm2K Jul 07 '12 at 21:18
  • @dAm2K It's certainly not SQL Injection, but it wouldn't be considered JavaScript Injection? – Pete Baughman Jul 07 '12 at 21:19
  • Yes, it's a sort of JavaScript injection. But if you google for "Injection Attack" you'll find SQL injection instead of XSS. – dAm2K Jul 07 '12 at 21:22

You can replace the <script> and <iframe> tags with something else, or you can HTML-encode the strings so that they appear as text on the page but are not rendered by the browser itself.

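Replacing every < and > with &lt; and &gt; should be enough to prevent the XSS you are seeing, as long as the text is only ever placed between tags; if it can end up inside an attribute value, & and the quote characters need escaping as well.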
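As a rough sketch of that replacement, and of where it stops being enough, the snippet below compares a hand-written angle-bracket replace with the standard library's `html.escape`. The `escape_angle_brackets` helper and the `<input>` markup are hypothetical, chosen only for this illustration.

```python
from html import escape


def escape_angle_brackets(text: str) -> str:
    """The replacement described above: only < and > are rewritten."""
    return text.replace("<", "&lt;").replace(">", "&gt;")


# Between tags, rewriting the angle brackets is enough to defuse a tag.
in_body = "<p>{}</p>".format(escape_angle_brackets("<script>alert(1)</script>"))

# Inside an attribute value the quote is what matters, and it survives.
payload = '" onmouseover="alert(1)'
weak = '<input value="{}">'.format(escape_angle_brackets(payload))

# html.escape also rewrites & and, with its default quote=True, " and '.
safe = '<input value="{}">'.format(escape(payload))

print(in_body)
print(weak)
print(safe)
```

Using `html.escape` everywhere avoids having to remember which context a value ends up in.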

Ned Batchelder
sean

I found this question, which covers XSS prevention in Python: "Python library for XSS filtering?"

dAm2K

It's a little out there, but you could write a block of code that recognizes certain key aspects of HTML/JavaScript code and acts accordingly. Recognize the block, for example, and either don't allow that query to be passed or edit it so it's no longer valid HTML.

Aaron Tp
  • And then browsers start supporting another feature, and you didn't account for it in your code, and there is another security hole. Luckily, just escaping *all* HTML special characters (specifically the angle brackets) should be enough to prevent future features from becoming a problem. – Martijn Pieters Jul 08 '12 at 11:41
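
The point made in the comment above can be shown with a small sketch. The `BLOCKED_TAGS` pattern and `strip_blocked_tags` helper below are hypothetical stand-ins for a "recognize the dangerous block" filter, not a recommended implementation; they only show how such a filter misses input it was never written for, while escaping everything does not.

```python
import re
from html import escape

# Hypothetical blacklist filter: strips only <script> and <iframe> tags.
BLOCKED_TAGS = re.compile(r"</?(?:script|iframe)\b[^>]*>", re.IGNORECASE)


def strip_blocked_tags(text: str) -> str:
    """Remove the tags the filter knows about and leave everything else."""
    return BLOCKED_TAGS.sub("", text)


# The case the filter was written for is handled...
blocked = strip_blocked_tags("<script>alert(1)</script>")

# ...but an event-handler attribute on an unlisted tag passes unchanged.
missed = strip_blocked_tags('<img src="x" onerror="alert(1)">')

# Escaping every special character does not depend on a list of tags.
escaped = escape('<img src="x" onerror="alert(1)">')

print(blocked)
print(missed)
print(escaped)
```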