htmlentities before inserting to database

Question

I want to make my website safe against XSS attacks.

I have one module which allows the user to enter text (with special chars) that is stored in the database. This text is displayed on the start page.

Now my question: Should I use `htmlentities($_POST["userinput"], ENT_QUOTES, 'UTF-8')` before inserting the user input into the database?

Or can I insert the user input directly into the database and just display it with `htmlentities`?

You shouldn't be manipulating the original data. It would be better to manipulate data when working on it, for example when displaying it on the screen or doing calculations. — LaVomit, Sep 02 '15 at 12:40
The script just inserts the userinput into the database. On another site, this userinput will be displayed -> no working on it. So htmlentities on the output is safe? — Bernd, Sep 02 '15 at 12:44
I don't know your plans exactly, but no matter what data, you should always want to keep track of the original data. So I suggest you'd use `htmlentities` when you are going to display it. Also make sure everything is UTF-8 here, also the storage in your database, because else special characters could come out unclear. This can be fixed by using `utf8_encode` and/or `utf8_decode` but obviously it's better to make sure you set up everything right. — LaVomit, Sep 02 '15 at 12:49

score 6 · Accepted Answer · answered Sep 02 '15 at 13:49

Some people argue that you should sanitize against XSS on both input and output. I don't think that adds much value. It only really matters that you do it on output, since that's where the vulnerability you are trying to mitigate exists. Any solution that relies on treating data coming from the database as trusted input is broken, in my opinion.

The issue is that somewhere down the line, you (or the person who comes after you) may decide to insert data through a different path (some external API, who knows) that skips the encoding step. Now your page has a security vulnerability in it, because you decided to trust data from the database.

For me, the argument against encoding both on the way in and on the way out has two parts:

1. You aren't adding any additional security, so you are really only making people feel like it is twice as safe. That is not a feeling we ever want to create. We want to prove something is safe, not just make it feel doubly safe.
2. You may also write a bug that corrupts the original data. If this happens when you are rendering it, it's not as big of a deal, because you can fix the bug and show the data correctly. If it happens when you store it, the original data is unrecoverable.
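
As a rough illustration of the approach argued for above, the following sketch stores the text unchanged and encodes it only when it is written into HTML. The table name `posts`, the helpers `save_post` and `render_post`, and the in-memory SQLite database are assumptions made so the example runs on its own (it needs the `pdo_sqlite` extension); they are not taken from the question.

```php
<?php
// Hypothetical schema and helper names, for illustration only.
// SQLite in memory is used just so the sketch runs standalone.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)');

// Store the input unchanged. The prepared statement protects against
// SQL injection, which is a separate concern from XSS.
function save_post(PDO $pdo, string $raw): void
{
    $stmt = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
    $stmt->execute([':body' => $raw]);
}

// Encode only at the point where the text is written into HTML.
function render_post(string $raw): string
{
    return '<p>' . htmlentities($raw, ENT_QUOTES, 'UTF-8') . '</p>';
}

// Stands in for $_POST["userinput"].
$userInput = '<script>alert("x")</script> Grüße';
save_post($pdo, $userInput);

// The database still holds the original text, byte for byte.
$stored = $pdo->query('SELECT body FROM posts')->fetchColumn();

// The markup is neutralised only here, on output.
echo render_post($stored), "\n";
```

Because the stored value is untouched, a bug in `render_post` can be fixed later without any data migration, and the same row can be reused in a non-HTML context (JSON, plain-text email) with encoding appropriate to that context.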

I have a problem with UTF-8. When I insert the htmlentities data into the db, I get the codes of the special chars in the database. And if I try to output them, I get these codes again. I didn't want to replace each code with the special char, but I think that's how I have to do it if I want "it twice as safe". — Bernd, Sep 02 '15 at 19:26
I think you'd be ok not encoding the data upon insert, and instead do it only when you display it. That's the idea I wanted to convey at least. — Gray, Sep 02 '15 at 19:29
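
The symptom described in the comment above (entity codes showing up on the page) is what encoding twice would produce. A minimal sketch, assuming the text was passed through `htmlentities` once on insert and once again on output; the variable names are made up for the example:

```php
<?php
$raw = 'Grüße & <b>';

// First pass: this is what would be stored if you encode before inserting.
$encodedOnInsert = htmlentities($raw, ENT_QUOTES, 'UTF-8');

// Second pass on output: the "&" of every existing entity is encoded again,
// so the browser shows the entity codes literally instead of the characters.
$encodedAgainOnOutput = htmlentities($encodedOnInsert, ENT_QUOTES, 'UTF-8');

echo $encodedOnInsert, "\n";
echo $encodedAgainOnOutput, "\n";

// Encoding only on output avoids the problem, because the raw text
// goes through htmlentities exactly once.
$encodedOnceOnOutput = htmlentities($raw, ENT_QUOTES, 'UTF-8');
```

`htmlentities` does accept a fourth `double_encode` argument that can be set to `false` to skip existing entities, but that would also leave an entity typed literally by a user unencoded, so storing the raw text and encoding once on output is the simpler route.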
