The Vaadin framework has a useful RichTextArea component. However, users can enter harmful content such as JavaScript into this field, so its value should be sanitized before it is saved.

What would be the Vaadin way of doing this? The Book of Vaadin only mentions that the field "should be sanitized" but gives no hint of how to actually do it. I asked in the forums a week ago and got no replies.

I don't want to add any more libraries to the project for this purpose. How would one go about writing a RichTextArea sanitizer in Java, with or without Vaadin?

Steve Waters
  • This depends on where you will output the entered values. If you "display" them via the RichText or Label component, then you don't have to do anything, since they will escape it correctly when displaying. – André Schild Dec 08 '14 at 07:01
  • It will be shown in the view AND saved in the database as a property of an Entity Bean (it's a message to describe an order). I guess the suitable viewing component in Vaadin for Rich Text input is just a RichTextArea in readOnly mode. – Steve Waters Dec 08 '14 at 08:55
  • In that case you should not need to do anything, since the Vaadin RichTextArea should correctly escape JS stuff. This does not prevent the end user from entering JavaScript in the form, but on display it should not be executed, just displayed as . But be aware that if you display it somewhere in the future with another UI, it might break then... – André Schild Dec 08 '14 at 10:57
  • No, you are not perfectly safe. If the RTE is fed malicious code, that code will be there in the DOM (though the way this is handled, it won't be executed directly), so this can bridge into a stored XSS. Also, writing your own sanitizer instead of using a library would be a sure way to get it wrong. – cfrick Dec 09 '14 at 20:30

1 Answer

The easiest approach is to use jsoup, which already comes with Vaadin 7 (vaadin-server depends on it), so no extra library is needed. For example:

String safeHtml = Jsoup.clean(richTextArea.getValue(), Whitelist.simpleText());

See the Jsoup.clean Javadoc:

public static String clean(String bodyHtml, Whitelist whitelist)

Get safe HTML from untrusted input HTML, by parsing input HTML and filtering it through a white-list of permitted tags and attributes.

Parameters:

bodyHtml - input untrusted HTML (body fragment)

whitelist - white-list of permitted HTML elements

Returns:

safe HTML (body fragment)

and the Whitelist Javadoc:

public class Whitelist extends Object

Whitelists define what HTML (elements and attributes) to allow through the cleaner. Everything else is removed.

Start with one of the defaults:

  • none()
  • simpleText()
  • basic()
  • basicWithImages()
  • relaxed()
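
To make the choice of whitelist concrete, here is a minimal sketch that runs the same untrusted input through two of the defaults listed above. The class name RichTextSanitizer, its sanitize methods, and the sample input are made up for illustration; only Jsoup.clean and the Whitelist factory methods come from jsoup. It assumes a jsoup version that still ships the Whitelist class (as the one bundled with Vaadin 7 does; newer jsoup releases rename it to Safelist).

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

// Hypothetical helper; the class and method names are illustrative only,
// not part of Vaadin or jsoup.
final class RichTextSanitizer {

    private RichTextSanitizer() {}

    // Keeps only simple inline text formatting; all other markup is removed.
    static String sanitize(String untrustedHtml) {
        return sanitize(untrustedHtml, Whitelist.simpleText());
    }

    // Filters the input through whichever whitelist the caller picks.
    static String sanitize(String untrustedHtml, Whitelist whitelist) {
        // Assumption: treat a null field value as an empty string.
        if (untrustedHtml == null) {
            return "";
        }
        return Jsoup.clean(untrustedHtml, whitelist);
    }

    public static void main(String[] args) {
        // Made-up input with an inline event handler and a script element.
        String input = "<p onclick=\"evil()\">Order <b>notes</b></p>"
                + "<script>alert(1)</script>";

        // simpleText(): the strictest of the two, inline formatting only.
        System.out.println(sanitize(input));

        // basic(): also permits block-level tags such as paragraphs.
        System.out.println(sanitize(input, Whitelist.basic()));
    }
}
```

In a Vaadin form you would probably call something like sanitize(richTextArea.getValue()) right before writing the value to the entity, so that only cleaned HTML ever reaches the database. Which default fits depends on how much formatting the editor's output should keep.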
cfrick