Help please, I have to print Unicode strings coming from a database (Oracle stored procedure => mapping into a Java object) in a JSP page (with Struts 1). I used this:

String unicodeStr = myBean.getTitle(); // returns the Unicode string from the database (something like: Uygulama g\u00fcvenli\u011fi ile)
String isoString = org.apache.commons.lang.StringEscapeUtils.escapeHtml(unicodeStr);

My problem is that unicodeStr arrives with "\\" in place of each "\" (Uygulama g\\u00fcvenli\\u011fi ile), so StringEscapeUtils.escapeHtml cannot detect Unicode characters such as "\u00fc" because of the extra "\" at the beginning.

I tried unicodeStr.replaceAll("\\","\"), but it does not compile because a lone "\" is not allowed in a string literal without escaping.

Lotfiction

2 Answers

I tried unicodeStr.replaceAll("\\","\"), but it can't compile since the "\" is not allowed in a string without escaping.

You can replace the double backslashes like this:

System.out.println("Uygulama g\\\\u00fcvenli\\\\u011fi ile".replaceAll("\\\\\\\\","\\\\"));

and it yields:

Uygulama g\u00fcvenli\u011fi ile

You can find an explanation here (see paragraph Regular Expressions, Literal Strings and Backslashes):

In literal Java strings the backslash is an escape character. The literal string "\\" is a single backslash. In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. This regular expression as a Java string, becomes "\\\\". That's right: 4 backslashes to match a single one.

The regex \w matches a word character. As a Java string, this is written as "\\w".

The same backslash-mess occurs when providing replacement strings for methods like String.replaceAll() as literal Java strings in your Java code. In the replacement text, a dollar sign must be encoded as \$ and a backslash as \\ when you want to replace the regex match with an actual dollar sign or backslash. However, backslashes must also be escaped in literal Java strings. So a single dollar sign in the replacement text becomes "\\$" when written as a literal Java string. The single backslash becomes "\\\\". Right again: 4 backslashes to insert a single one.
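
The layers described above can be seen side by side in a small sketch. The class and method names (BackslashDemo, collapseWithRegex, collapseLiteral, collapseQuoted) are made up for illustration; only String.replaceAll, String.replace, Pattern.quote and Matcher.quoteReplacement are standard JDK calls. It assumes the input really contains two backslash characters before each "u".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackslashDemo {

    // Regex variant: 8 backslashes in source = 4 in the string = a pattern matching 2 backslashes.
    // The replacement has 4 in source = 2 in the string = 1 backslash inserted.
    public static String collapseWithRegex(String s) {
        return s.replaceAll("\\\\\\\\", "\\\\");
    }

    // Literal variant: String.replace does no regex processing,
    // so only the Java string-literal escaping remains.
    public static String collapseLiteral(String s) {
        return s.replace("\\\\", "\\");
    }

    // Quoting variant: let the JDK add the regex and replacement escaping.
    public static String collapseQuoted(String s) {
        return s.replaceAll(Pattern.quote("\\\\"), Matcher.quoteReplacement("\\"));
    }

    public static void main(String[] args) {
        // Two backslash characters before each 'u' at runtime (assumed input).
        String doubled = "Uygulama g\\\\u00fcvenli\\\\u011fi ile";
        // Each call prints the text with a single backslash before each 'u'.
        System.out.println(collapseWithRegex(doubled));
        System.out.println(collapseLiteral(doubled));
        System.out.println(collapseQuoted(doubled));
    }
}
```

When no real pattern matching is needed, the literal variant is probably the easiest to read, since it avoids the second escaping layer entirely.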

linski

If you know that the strings in the database are all stored in their Java-escaped form, why not simply decode them before escaping them to HTML?

import org.apache.commons.lang.StringEscapeUtils;

String unicodeEscapedStr = myBean.getTitle();
String unicodeStr = StringEscapeUtils.unescapeJava(unicodeEscapedStr);
String isoString = StringEscapeUtils.escapeHtml(unicodeStr);
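
To see the decoding step in isolation, without Commons Lang on the classpath, a JDK-only sketch might look like the following. It is a simplified stand-in, not the library's implementation: the class name UnicodeEscapeDecoder and the method decode are hypothetical, and it only handles a backslash followed by "u" and four hex digits, whereas unescapeJava also handles other Java escapes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapeDecoder {

    // Matches one backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each matched escape sequence with the character it encodes.
    public static String decode(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or backslash in the decoded character
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // One backslash before each 'u' at runtime, as in the question's sample value.
        String fromDb = "Uygulama g\\u00fcvenli\\u011fi ile";
        System.out.println(decode(fromDb));
    }
}
```

If the stored value has doubled backslashes, this sketch leaves one backslash in front of each decoded character, so the doubled backslashes would need collapsing first.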
Robert