9

When displaying data from a MySQL database, I pass it through the htmlspecialchars() wrapper below, which should convert single and double quotes to their safe entities. The problem is that when I view the source code, only < > & are converted, and I also need single and double quotes converted.

//sanitize data from db before displaying on webpage
function htmlsan($htmlsanitize){
    return $htmlsanitize = htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}

Then, when I want to use it, I do for example:

htmlsan($row['comment']);

Can someone tell me why it's not converting single and double quotes?

UPDATE

What's strange is that htmlsan() is also used on the comment in an email, and when I view the source of the email the quotes are converted. It seems that only the single/double quotes coming from the database are not converted when displayed on the webpage. My database collation is set to utf8_general_ci and I declare utf8 on the database connection, etc.

halfer
PHPLOVER

5 Answers

11

How exactly are you testing it?

<?php

//sanitize data from db before displaying on webpage
function htmlsan($htmlsanitize){
    return $htmlsanitize = htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}

var_dump(htmlsan('<>\'"'));

... prints:

string(20) "&lt;&gt;&#039;&quot;"

My guess is that your input string comes from Microsoft Word and contains typographical quotes:

var_dump(htmlsan('“foo”')); // string(9) "“foo”" 

If you do need to convert them for whatever reason, you need htmlentities() rather than htmlspecialchars():

var_dump(htmlentities('“foo”', ENT_QUOTES, 'UTF-8')); // string(17) "&ldquo;foo&rdquo;"
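
To make the difference concrete, here is a minimal side-by-side sketch. The sample strings ($ascii, $typographic) are hypothetical and not taken from the question; the point is only that htmlspecialchars() with ENT_QUOTES touches & < > " ' and leaves typographic quotes alone, while htmlentities() escapes those five characters as well as anything else that has a named entity.

```php
<?php
// Hypothetical sample strings, not taken from the question.
$ascii = "It's \"quoted\"";
$typographic = "“foo” & <b>";

// htmlspecialchars() with ENT_QUOTES escapes & < > " ' and nothing else.
$asciiSpecial = htmlspecialchars($ascii, ENT_QUOTES, 'UTF-8');
$typoSpecial  = htmlspecialchars($typographic, ENT_QUOTES, 'UTF-8');

// htmlentities() also replaces every character that has a named entity.
$typoEntities = htmlentities($typographic, ENT_QUOTES, 'UTF-8');

echo $asciiSpecial, "\n"; // It&#039;s &quot;quoted&quot;
echo $typoSpecial, "\n";  // “foo” &amp; &lt;b&gt;
echo $typoEntities, "\n"; // &ldquo;foo&rdquo; &amp; &lt;b&gt;
```

Both functions escape < > &, so switching to htmlentities() does not lose any of the protection htmlspecialchars() gives.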

Update #1

Alright, it's time for some proper testing. Type a single quote (') in your comment database field and run the following code when you retrieve it:

var_dump(bin2hex("'"));
var_dump(htmlspecialchars("'", ENT_QUOTES, 'UTF-8'));
var_dump(bin2hex($row['comment']));
var_dump(htmlspecialchars($row['comment'], ENT_QUOTES, 'UTF-8'));

It should print this:

string(2) "27"
string(6) "&#039;"
string(2) "27"
string(6) "&#039;"

Please update your question and confirm whether you ran this test and got the same or a different output.

Update #2

Please look carefully at the output you claim to be obtaining:

string(6) "'"

That's not a string with 6 characters. You are not looking at the real output: you are looking at the output as rendered by a browser. I'm pretty sure you are getting the expected result, i.e. string(6) "&#039;". If you render &#039; with a web browser it becomes '. Use the View Source menu in your browser to see the real output.
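
A small sketch of why the rendered view is misleading, assuming a plain ASCQ quote as input (variable names here are illustrative only). It checks the raw bytes, uses html_entity_decode() as a rough stand-in for what a browser or DOM inspector does when it decodes entities, and shows that escaping a second time makes the entity visible even in a rendered page:

```php
<?php
$escaped = htmlspecialchars("'", ENT_QUOTES, 'UTF-8');

// What the server actually sends: six bytes, "&#039;".
var_dump(strlen($escaped));  // int(6)
var_dump(bin2hex($escaped)); // string(12) "26233033393b"

// Roughly what a rendered page shows after entities are decoded.
$rendered = html_entity_decode($escaped, ENT_QUOTES, 'UTF-8');
var_dump($rendered);         // string(1) "'"

// Debugging trick: escape twice so the entity itself survives rendering.
$visible = htmlspecialchars($escaped, ENT_QUOTES, 'UTF-8');
var_dump($visible);          // string(10) "&amp;#039;"
```

The double escaping is only a debugging aid; real output should be escaped once.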

Álvaro González
  • Hi, it's not coming from Microsoft Word. It's data being retrieved from the database, and I typed it myself on the keyboard when testing, with no copying or pasting. Also, I'm not sure why you say I should use htmlentities(), as htmlspecialchars() should work according to the PHP manual. – PHPLOVER Jan 18 '11 at 10:36
  • `htmlentities()` converts most non-ASCII chars into HTML entities while `htmlspecialchars()` only converts five specific characters. Are you sure your quotes are actual 7-bit ASCII quotes? How have you tested it? – Álvaro González Jan 18 '11 at 12:02
  • Hi, I basically just typed the quotes (single/double) into a textarea using the keyboard. It's not complicated code and there is nothing that could be causing a problem. I then created a blank page with a simple form and tried again, but the quotes were still not converted. I also skipped the function and called htmlspecialchars($string, ENT_QUOTES) directly, and it still only converts the 3 characters < > & . I tried htmlentities but it does not convert < > & for me, which could be a problem as data needs to be cleansed when displayed from the database. – PHPLOVER Jan 19 '11 at 10:30
  • Hi Alvaro, much appreciate your time and help. This is what it printed out to screen by typing in a single quote string(6) "'" string(2) "27" string(6) "'" string(2) "27" Thanks – PHPLOVER Jan 19 '11 at 12:51
  • I suggested updating the question because there you can format the output to make it readable, you could post the **exact code** you ran, and the information is relevant for everyone. Never mind, see my second update. – Álvaro González Jan 19 '11 at 13:08
  • Hi Alvaro. Viewing source shows the same thing. I did view source anyway, since obviously the browser will show it as it is meant to look, while view source will show it in its safe form. string(6) "'" string(2) "27" string(6) "'" string(2) "27" Thanks – PHPLOVER Jan 19 '11 at 14:19
  • Hi Alvaro, please read the comment above. I have just noticed that it is actually working :O The reason is that I was viewing the source with Firebug in Firefox, and Firebug does not show it the way "View Source" in the browser menu does. I am awfully sorry; I never knew that Firebug showed it the way the webpage renders it. I thought it would show it the same as viewing source via the browser itself. – PHPLOVER Jan 19 '11 at 14:34
  • Well, yes, Firebug's `HTML` tab decodes HTML entities (I hadn't noticed it either). Anyway, I'm glad you got it working. At least you've learnt a couple of debugging tips :) – Álvaro González Jan 19 '11 at 15:06
  • Yeah, and thanks for your hard work. Much appreciate your time and effort :) – PHPLOVER Jan 19 '11 at 15:23
  • Update 2 was helpful. Thanks for pointing that out. I was missing it too. – Rich Nov 26 '20 at 06:50
4

When you view source code using Firebug, Firebug shows it the way the web browser displays it. I thought it would show the source the same as View Source in the browser menu bar. A lesson learnt the hard way, and one I will remember. Thanks everyone for your valuable time and input.

halfer
PHPLOVER
  • Ran into the same issue. Testing it through Firefox view source is a total dud with `htmlentities($str,ENT_QUOTES, "UTF-8");` and the single quote appears as a single quote. If you have Chris Pederick's Web Developer extension and do the view source through that, you see the single quote display as `&#039;`. Moral of the story: use a raw view source that doesn't prettify your page HTML so you can tell things are working. – Fiasco Labs Jan 27 '13 at 17:58
1

Had the same problem. My database uses utf8_unicode_ci and my HTML charset was utf-8, and htmlentities converted everything but quotes. I thought that having the same charset in both the db and the HTML would work fine, but it didn't. So I changed the HTML charset to iso-8859-1 and it worked. I don't know why, but it worked. My db still uses utf8_unicode_ci.

Carolina
1

Not sure if this will make any difference, but have you tried removing the "$htmlsanitize =" assignment?

function htmlsan($htmlsanitize){
    return htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}
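
For what it's worth, a quick check suggests the assignment is redundant rather than harmful. This is only a sketch; the two function names below are hypothetical and exist just to compare the two forms on the same input:

```php
<?php
// Form used in the question: assigns to the parameter, then returns it.
function htmlsanWithAssignment($value) {
    return $value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Form suggested here: returns the escaped string directly.
function htmlsanDirect($value) {
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$sample = '<>\'"';
$same = htmlsanWithAssignment($sample) === htmlsanDirect($sample);
var_dump($same); // bool(true)
```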
Matt Lowden
  • Hi Matt, thanks for replying, much appreciated. For some reason it still won't convert single and double quotes, yet it will convert < > & – PHPLOVER Jan 18 '11 at 10:07
0

Using

htmlentities($htmlsanitize, ENT_QUOTES, 'UTF-8');

or

mb_convert_encoding($htmlsanitize, "HTML-ENTITIES", "UTF-8");

would probably do what you want.

Fiasco Labs
Dai
  • Hi, the first option is what I already have, and the second option does not convert possibly malicious tags like JavaScript etc. to safe entities like &lt; &gt;. – PHPLOVER Jan 18 '11 at 10:46