When outputting a string in HTML, one must escape special characters as HTML entities ("&<>" etc.) for understandable reasons.
I've examined two Java implementations of this: org.apache.commons.lang.StringEscapeUtils.escapeHtml(String) and net.htmlparser.jericho.CharacterReference.encode(CharSequence).
Both escape every character above Unicode code point 127 (0x7F), that is, every non-ASCII character, which covers effectively all non-English text.
This behavior is fine, but the strings it produces are not human-readable when the text is non-English (for example, Hebrew or Arabic). I've seen that when characters above code point 127 aren't escaped like this, they still render correctly in browsers. I believe this is because the HTML page is UTF-8 encoded, so the browser can decode these characters directly.
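To make the alternative concrete, here is a minimal sketch of the kind of escaping I have in mind. It assumes the page really is served and declared as UTF-8. The class name HtmlEscapeSketch and its escape method are hypothetical and are not part of either library above; the sketch only replaces the characters that are significant in HTML markup and passes everything else through, including non-ASCII characters.

```java
// Hypothetical helper, not taken from Commons Lang or Jericho.
// Assumption: the output is embedded in a page that is served as UTF-8.
final class HtmlEscapeSketch {

    // Escapes only the markup-significant characters; every other
    // character (including those above code point 127) is copied as-is.
    static String escape(CharSequence input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\u05E9\u05DC\u05D5\u05DD" is the Hebrew word "shalom", written
        // with Unicode escapes so the source file's encoding does not matter.
        System.out.println(escape("<b>\u05E9\u05DC\u05D5\u05DD & \"hello\"</b>"));
    }
}
```

With this approach the markup characters come out as entities while the Hebrew letters stay as they are in the output string, which is the readability I'm after.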
My question: Can I safely disable escaping Unicode characters above code point 127 when escaping HTML entities, provided my web page is UTF-8 encoded?