I need help with a simple issue. The article content of some pages on my website is stored in a MySQL database, and I apply htmlspecialchars() to it when outputting it to the browser. However, the content contains legitimate tags such as <img src="images/me.jpg">, which are then rendered as plain text when they are supposed to be displayed as images within the article.

How can I display the images while still avoiding XSS attacks and the like?

Thanks

Emil Vikström

3 Answers

The common way is NOT to use HTML for this, but a separate formatting language such as BBCode or Markdown. That way you can easily transform the formatting into HTML while preventing users from entering whatever HTML they want.
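
As a rough illustration of that idea, here is a minimal sketch that escapes everything and then converts a single BBCode-style tag back into an image. The function name `renderArticle`, the `[img]` syntax, and the rule that images must live under `images/` are assumptions made for this example, not part of any particular BBCode library.

```php
<?php
// Hypothetical helper: escape all input, then re-enable one narrow [img] tag.
function renderArticle(string $raw): string
{
    // 1. Escape everything, so nothing the author typed is treated as markup.
    $safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

    // 2. Convert [img]path[/img] into a real <img>, but only for paths that
    //    match a strict pattern (assumed here: files directly under images/).
    $pattern = '~\[img\](images/[A-Za-z0-9_\-]+\.(?:jpe?g|png|gif))\[/img\]~';

    return preg_replace($pattern, '<img src="$1" alt="">', $safe);
}

echo renderArticle('Hello [img]images/me.jpg[/img] <script>alert(1)</script>');
```

Anything that does not match the pattern, for example `[img]javascript:alert(1)[/img]`, simply stays as escaped text. A real BBCode or Markdown library would cover far more syntax than this single tag.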

Emil Vikström

Parse the HTML according to the HTML standard and discard any elements/attributes/attribute values you don't want to keep. Check the src value of every img element to see if it's a valid URL, and if it is, check to see if it actually exists and is a valid image. If not, discard the element.

If you use a dedicated formatting language (e.g. BBCode or Markdown), you should still perform the checks against the value provided for each img element (many of the libraries that parse BBCode, Markdown, etc. will perform these checks for you).
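
The whitelist approach could look roughly like the sketch below. It relies on PHP's built-in DOMDocument, whose parser is lenient but does not follow the current HTML standard exactly, so treat it as an approximation. The function name `filterArticleHtml`, the list of allowed tags and attributes, and the `images/` path rule are all assumptions for illustration. The check that the image actually exists is left out.

```php
<?php
// Sketch only: the whitelist and the src pattern below are assumptions.
function filterArticleHtml(string $html): string
{
    // Allowed elements mapped to their allowed attributes.
    $allowed = [
        'p' => [], 'br' => [], 'strong' => [], 'em' => [],
        'img' => ['src', 'alt'],
    ];

    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc = new DOMDocument();
    // The meta tag hints UTF-8; the wrapper div gives us a known root element.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div id="article-root">' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);

    // Snapshot the node list first: it is live and changes as nodes are removed.
    foreach (iterator_to_array($root->getElementsByTagName('*'), false) as $el) {
        $tag = strtolower($el->nodeName);
        if (!isset($allowed[$tag])) {
            $el->parentNode->removeChild($el); // discard the element and its children
            continue;
        }
        // Strip every attribute that is not whitelisted for this tag.
        foreach (iterator_to_array($el->attributes, false) as $attr) {
            if (!in_array(strtolower($attr->nodeName), $allowed[$tag], true)) {
                $el->removeAttributeNode($attr);
            }
        }
        // Images must point at a plain file under images/ (assumed rule).
        if ($tag === 'img'
            && !preg_match('~\Aimages/[A-Za-z0-9_\-]+\.(?:jpe?g|png|gif)\z~', $el->getAttribute('src'))) {
            $el->parentNode->removeChild($el);
        }
    }

    // Serialize only the children of the wrapper div.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Run the filter when the article is saved, or cache the result, so the parsing cost is not paid on every page view.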

0b10011

Use HTMLPurifier. It removes any scripts, including JavaScript placed in tag attributes, while preserving the rest of the HTML and making it well-formed.
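
A minimal usage sketch, assuming HTMLPurifier is installed (for example through Composer as `ezyang/htmlpurifier`) and the autoloader has already been included by the application. The wrapper name `purifyArticle` and the particular `HTML.Allowed` list are choices made for this example.

```php
<?php
// Assumes HTMLPurifier is already loaded, e.g. via require 'vendor/autoload.php';
// purifyArticle() is a hypothetical wrapper, not part of the library.
function purifyArticle(string $dirtyHtml): string
{
    $config = HTMLPurifier_Config::createDefault();

    // Whitelist of elements and attributes that survive purification
    // (this particular list is an assumption; adjust it to your content).
    $config->set('HTML.Allowed', 'p,br,strong,em,a[href],img[src|alt]');

    $purifier = new HTMLPurifier($config);

    // Returns the cleaned HTML; script tags and event-handler attributes
    // are not in the whitelist, so they are stripped.
    return $purifier->purify($dirtyHtml);
}
```

Output the purified result directly instead of passing it through htmlspecialchars(), otherwise the allowed tags are escaped again.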

Maxim Krizhanovsky