I have a utf8mb4 database field which has ended up containing HTML-encoded special characters, such as an entity-encoded apostrophe in 's.

This is user-entered data from an HTML form. To display this field in Laravel Blade I use {{ $profile }}, but that runs the value through PHP's htmlspecialchars function to prevent XSS attacks (https://laravel.com/docs/6.x/blade#displaying-data), so the output shows the raw entity text instead of 's.

I know I can display it as unescaped data using {!! $profile !!}, but since this is user-entered data there's a risk that anything could get output.

What's the best way to approach this? Is there a way to clean it up at the database layer without losing or corrupting the data? Or is there a better technique at the presentation layer that still avoids XSS risks?

Note this is data from a legacy database.

Any help appreciated.

* UPDATE *

I tried using this htmlpurify package: https://github.com/stevebauman/purify, which seems to do the trick, similar to using htmlspecialchars($value, ENT_QUOTES,'UTF-8',true), e.g.:

{{ Purify::clean($value) }} or {{ htmlspecialchars($value, ENT_QUOTES,'UTF-8',true) }}  

However, if I have something like the following in the database:

 Jobs &amp; Work

Then using htmlpurify or htmlspecialchars as in the example above, the output still ends up as:

 Jobs &amp; Work
adam78
  • Since it's legacy data, I'd run a script to convert these back to unescaped in the database. Barring that, something like HTML Purifier. – ceejayoz Jan 16 '20 at 21:03
  • Just for info, do you get the same result with this? `htmlspecialchars($value, ENT_QUOTES,'UTF-8',true);` –  Jan 16 '20 at 21:55
  • @Dilek see my updated post above. I'm still getting `&amp;` for `&` – adam78 Jan 16 '20 at 22:33
  • @adam78 I see, thanks! I had the same problem when I set the charset to utf8mb4 in my connection, because my database was utf8 or utf8_general_ci and the columns were some Swedish collation I don't remember. I couldn't find a way to automatically convert those characters to what I needed, so I deleted all of them manually in Notepad++, converted the table and its columns to utf8mb4, and then updated it with the edited content. It works fine now. You know the way, I just wanted to share. –  Jan 17 '20 at 06:27
  • Please provide an example of the text -- both as shown and via HEX(...). – Rick James Jan 20 '20 at 18:27
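
The one-off conversion script suggested in the comments could be sketched roughly as below. This is only an illustration: `decodeFully` and its `$maxPasses` parameter are hypothetical names, not part of Laravel or Purify, and the sketch assumes the stored values were entity-encoded one or more times and never legitimately contain literal entity text.

```php
<?php
// Hypothetical helper for a one-off cleanup script (not a Laravel or Purify API).
// Decodes HTML entities repeatedly until the value stops changing, which also
// handles values that were encoded more than once (e.g. "&amp;amp;").
function decodeFully(string $value, int $maxPasses = 5): string
{
    for ($i = 0; $i < $maxPasses; $i++) {
        $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
        if ($decoded === $value) {
            break; // nothing left to decode
        }
        $value = $decoded;
    }

    return $value;
}

echo decodeFully('Jobs &amp;amp; Work'), PHP_EOL; // prints "Jobs & Work"
```

A cleanup script would read each row, pass the field through a function like this, and write the result back, ideally after taking a backup. The cleaned values are plain text, so they would still need to be escaped on output, for example with Blade's {{ }}.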

1 Answer

Running the value through Purify and then outputting the result unescaped seems to fix it, e.g.:

{!! Purify::clean($value) !!}

as opposed to the following, which escapes the already-cleaned output a second time:

{{ Purify::clean($value) }} 
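
If the field is meant to be plain text rather than HTML, another option at the presentation layer might be to decode the stored entities first and then escape once on output. The sketch below is only an illustration of that idea in plain PHP; `displaySafe` is a hypothetical helper name, not part of Laravel or Purify.

```php
<?php
// Hypothetical helper (not a Laravel or Purify API): decode stored entities,
// then escape exactly once for HTML output.
function displaySafe(string $stored): string
{
    // Turn stored entities such as &amp; back into plain characters.
    $plain = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Escape once for output, similar to what Blade's {{ }} does.
    return htmlspecialchars($plain, ENT_QUOTES, 'UTF-8', true);
}

echo displaySafe('Jobs &amp; Work'), PHP_EOL;           // prints "Jobs &amp; Work"
echo displaySafe('<script>alert(1)</script>'), PHP_EOL; // prints "&lt;script&gt;alert(1)&lt;/script&gt;"
```

In a Blade template the same effect should be achievable with {{ html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8') }}, since {{ }} applies the escaping step itself. Unlike Purify, this approach shows any markup in the field as literal text instead of keeping allowed tags.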
adam78