4

I have this String

String x="String containning special chars  \u202C \n  \u202C  \u202C  \u202C";

How can I print it with the escapes shown literally, i.e. String containning special chars \u202C \n \u202C \u202C \u202C ?

Tried

System.out.println(x.replace("\\","\\\\"));

But that only prints String containning special chars ‬ \n ‬ ‬ ‬

Also tried

String out = org.apache.commons.lang3.StringEscapeUtils.unescapeJava(x);
System.out.println(out);

But that doesn't help either.

Does anyone have a suggestion, or know of an API that I am not aware of?

UPDATE - SOLUTION

Following @lbear's approach, I came up with this function, which handles most cases of escaped Strings:

public static String removeUnicodeAndEscapeChars(String input) {
    StringBuilder buffer = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); i++) {
        char c = input.charAt(i);
        if (c > 256) {
            // Pad to four hex digits so the escape is well-formed.
            buffer.append(String.format("\\u%04X", (int) c));
        } else if (c == '\n') {
            buffer.append("\\n");
        } else if (c == '\t') {
            buffer.append("\\t");
        } else if (c == '\r') {
            buffer.append("\\r");
        } else if (c == '\b') {
            buffer.append("\\b");
        } else if (c == '\f') {
            buffer.append("\\f");
        } else if (c == '\'') {
            buffer.append("\\'");
        } else if (c == '"') {
            // Emit a backslash followed by the quote itself.
            buffer.append("\\\"");
        } else if (c == '\\') {
            buffer.append("\\\\");
        } else {
            buffer.append(c);
        }
    }
    return buffer.toString();
}
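To check the approach against the string from the question, here is a minimal, self-contained sketch. The class name `EscapeDemo` and the method `escapeForDisplay` are hypothetical names chosen for illustration, not part of any library. It also makes one different choice from the function above, which is an assumption on my part: it escapes everything outside printable ASCII (rather than only chars above 256) and leaves single quotes alone.

```java
final class EscapeDemo {

    // Hypothetical helper: turns control and non-ASCII chars back into
    // Java-style escape sequences so they can be printed literally.
    static String escapeForDisplay(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\n': out.append("\\n"); break;
                case '\t': out.append("\\t"); break;
                case '\r': out.append("\\r"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                default:
                    if (c < 0x20 || c > 0x7E) {
                        // Assumption: anything outside printable ASCII is
                        // written as a four-digit, zero-padded hex escape.
                        out.append(String.format("\\u%04X", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String x = "String containning special chars  \u202C \n  \u202C  \u202C  \u202C";
        // Prints the text on one line, with the escapes spelled out.
        System.out.println(escapeForDisplay(x));
    }
}
```

Zero-padding matters for low code points: without it, U+0101 would come out with only three hex digits, which is not a valid Java escape.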
MaVRoSCy

3 Answers

4

Apache Commons Lang has StringEscapeUtils, which offers HTML encoding. That encoding is pretty close to what you may need (in Commons Lang 3 the method is named escapeHtml4):

String escapedCode = StringEscapeUtils.escapeHtml(rowId);

See the documentation.

Brig
3

Use Integer.toHexString((int) x.charAt(34)) to get the hex code of the Unicode char, then prepend \\u to it to get the escaped String.

public static String removeUnicode(String input) {
    StringBuilder buffer = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); i++) {
        char c = input.charAt(i);
        if (c > 256) {
            // Pad to four hex digits so the escape is well-formed.
            buffer.append(String.format("\\u%04X", (int) c));
        } else if (c == '\n') {
            buffer.append("\\n");
        } else {
            buffer.append(c);
        }
    }
    return buffer.toString();
}
Daniel Causebrook
lbear
  • So what exactly should I supply to `System.out.print()` to get the desired result? – MaVRoSCy Jun 14 '13 at 08:22
  • What you did here is process Unicode characters and newlines. What about all the others? This is basically the same as the other answers, except that it processes all Unicode characters together, which is almost okay. But there MUST be a general solution; remember, all that needs to be done is to replace \ with \\ – Peter Jaloveczki Jun 14 '13 at 08:43
  • @MaVRoSCy I have fixed it. Please try it again. – lbear Jun 14 '13 at 08:45
  • @MaVRoSCy A Unicode character is regarded as one single char, so when we use `replace()`, it cannot find a \ leading the Unicode char. – lbear Jun 14 '13 at 08:49
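The point about a Unicode escape being one single char can be checked directly. The following is a minimal sketch; `SingleCharDemo`, `containsBackslash`, and `hexAt` are hypothetical names used only for illustration. It shows that the compiled string from the question holds no backslash at all, which is why replacing \ with \\ changes nothing, and that index 34 holds the U+202C char itself.

```java
final class SingleCharDemo {

    // The string from the question; the compiler turns each escape
    // into one char before the program ever runs.
    static final String X =
            "String containning special chars  \u202C \n  \u202C  \u202C  \u202C";

    // True if the runtime string holds at least one backslash char.
    static boolean containsBackslash(String s) {
        return s.indexOf('\\') >= 0;
    }

    // Hex code of the char at the given index (lowercase, unpadded).
    static String hexAt(String s, int index) {
        return Integer.toHexString(s.charAt(index));
    }

    public static void main(String[] args) {
        System.out.println(containsBackslash(X));               // false
        System.out.println(hexAt(X, 34));                       // 202c
        System.out.println(X.replace("\\", "\\\\").equals(X));  // true: nothing to replace
    }
}
```

So any general solution has to walk the chars and rebuild the escape text, since the original backslashes exist only in the source file.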
-1
String original = "String containning special chars  \u202C \n  \u202C  \u202C  \u202C";
String escaped = original.replace("\u202C", "\\u202C");
System.out.println(escaped);
JB Nizet