
I use htmlspecialchars() to escape some text, but for characters like it outputs nothing. I know this is because the input isn't valid UTF-8, but how can I make htmlspecialchars() ignore that, so that I can store the text in a database and then display the characters on a webpage?

Jonan
  • Please enable PHP error reporting to the highest level, enable error logging, re-run your script and watch the error log to see which warnings are given. – hakre Apr 20 '14 at 17:19
  • You can pass an encoding argument to `htmlspecialchars` – Musa Apr 20 '14 at 17:19
  • I cannot reproduce that in PHP 5.2 through PHP 5.5, see http://codepad.viper-7.com/L8qu38 – jeroen Apr 20 '14 at 17:49
  • What is the character encoding of the input string if it is not UTF-8? And can you please provide an example line of code with the `htmlspecialchars()` call incl. all parameters you use? – hakre Apr 21 '14 at 22:22
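The encoding questions in the comments above point at one way to keep the characters instead of losing them: convert the input to UTF-8 before escaping it. The following is only a minimal sketch. The question never states the real input encoding, so `Windows-1252` is an assumption, `escapeLegacyText` is a hypothetical helper name, and the mbstring extension is assumed to be available.

```php
<?php
/**
 * Hypothetical helper: escape text for output on a UTF-8 page.
 * If the text is not valid UTF-8, it is first converted from an
 * ASSUMED source encoding (Windows-1252 by default).
 */
function escapeLegacyText($text, $assumedEncoding = 'Windows-1252')
{
    // Only convert when the bytes are not already valid UTF-8.
    if (!mb_check_encoding($text, 'UTF-8')) {
        $text = mb_convert_encoding($text, 'UTF-8', $assumedEncoding);
    }

    // The string is now valid UTF-8, so no characters are dropped here.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// "\x99" is the trademark sign in Windows-1252 but is invalid as UTF-8.
echo escapeLegacyText("Acme\x99 & Co."), "\n";
```

If the real source encoding is different, the second argument would need to change accordingly; guessing it wrong produces valid UTF-8 but the wrong characters.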

1 Answer


The specific flag for what you're asking is ENT_IGNORE. However, the PHP docs recommend against this flag; you should replace the invalid sequences (ENT_SUBSTITUTE) rather than silently discard them.

Take a look at the flags to determine which one you need: http://php.net/htmlspecialchars
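To make the difference between the flags concrete, here is a small sketch comparing the three behaviours on the same input. The sample string is an assumption chosen for illustration: it contains the single byte `\x99`, which is not valid UTF-8 on its own.

```php
<?php
// Sample input (assumed for illustration): "\x99" is an invalid UTF-8 byte.
$input = "Acme\x99 <b>";

// No error-handling flag: an invalid sequence makes the result an empty string.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// ENT_IGNORE: the invalid byte is silently discarded (discouraged by the PHP manual).
$ignored = htmlspecialchars($input, ENT_QUOTES | ENT_IGNORE, 'UTF-8');

// ENT_SUBSTITUTE (PHP 5.4+): the invalid byte is replaced with U+FFFD.
$substituted = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');
var_dump($ignored === 'Acme &lt;b&gt;');
var_dump($substituted === "Acme\xEF\xBF\xBD &lt;b&gt;");
```

None of these flags preserves the original character; they only control whether the invalid bytes empty the result, disappear, or become the replacement character.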

Anthony Calandra
  • but if I use `ENT_IGNORE` it'll just skip those characters, right? what I want is that those characters just stay as they are – Jonan Apr 20 '14 at 17:35
  • @Jonan why wouldn't you want to convert them to their entity equivalents? Anyway, it seems I misread your post but for what you're asking I don't think any of the flags will ignore them - they would either discard or convert to a special character. – Anthony Calandra Apr 20 '14 at 17:44
  • If I use `ENT_SUBSTITUTE`, `™` will be converted to `�`. If I use `ENT_IGNORE`, it'll just be ignored – Jonan Apr 20 '14 at 17:46
  • @Jonan is your encoding parameter set to UTF-8? Remember this flag will convert them to FFFD; so to convert them to their character entities, you will probably want to use htmlentities(). – Anthony Calandra Apr 20 '14 at 17:50
  • yes the encoding parameter is `UTF-8`. `htmlentities()` returns the same as `htmlspecialchars()` – Jonan Apr 20 '14 at 17:56
  • @Jonan hmmm, well you must be using PHP 5.4.0+ for ENT_SUBSTITUTE to be available and I can't reproduce this in any of those versions I have access to... – Anthony Calandra Apr 20 '14 at 18:19
  • Is there a way to change these non-`UTF-8` characters to a `�`? Because I'd rather have that than `�` – Jonan Apr 20 '14 at 20:28
  • @Jonan this could be done with htmlspecialchars/htmlentities with the ENT_SUBSTITUTE flag. – Anthony Calandra Apr 20 '14 at 20:37
  • I get `�` when I use the `ENT_SUBSTITUTE` flag – Jonan Apr 21 '14 at 08:37
  • @Jonan I figured, but I can't reproduce this behaviour at all on my machines. – Anthony Calandra Apr 21 '14 at 15:19