How to replace \0 (NUL) in the String?

String b = "2012yyyy06mm";               // what I want
String c = "2\0\0\0012yyyy06mm";
String d = c.replaceAll("\\\\0", "");    // does not work
String e = d.replace("\0", "");          // same result
System.out.println(c+"\n"+d+"\n"+e);

String bb = "2012yyyy06mm";
System.out.println(b.length() + " > " +bb.length());  

The above code prints 12 > 11 to the console. What happened?

String e = c.replace("\0", "");
System.out.println(e);      // prints 2(a bad character)2yyyy06mm
user1900556

1 Answer

Your string "2\0\0\0012yyyy06mm" does not start with 2 {NUL} {NUL} {NUL} 0 1 2, but instead starts with 2 {NUL} {NUL} {SOH} 2.

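The \001 is treated as a single character with ASCII code 1 (SOH), not as a NUL followed by the digits 0 and 1. An octal escape consumes up to three octal digits, so the compiler reads \001 as one escape.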

The result is that only two characters are removed, not three, and the invisible SOH stays in the string.
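
To see this for yourself, here is a minimal sketch that prints the numeric code of each leading character. The class name `OctalEscapeDemo` and the helper `codes` are made up for illustration; only the string literal comes from the question.

```java
public class OctalEscapeDemo {
    // Hypothetical helper: renders each character of s as its numeric code,
    // separated by spaces, so that invisible characters become visible.
    static String codes(String s) {
        StringBuilder sb = new StringBuilder();
        for (char ch : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append((int) ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String c = "2\0\0\0012yyyy06mm";   // the literal from the question

        System.out.println(c.length());                // 13
        System.out.println(codes(c.substring(0, 5)));  // 50 0 0 1 50
    }
}
```

The codes 50 0 0 1 50 are '2', NUL, NUL, SOH, '2': there is no third NUL and no '0' or '1' digit.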

I don't think there's any way to represent digits following an abbreviated octal escape other than by breaking the string apart:

String c = "2" + "\0\0\0" + "012yyyy06mm";

Alternatively, write all three digits of the (last) octal escape, so that the digits that follow are not interpreted as part of the escape:

String c = "2\000\000\000012yyyy06mm";

Once you've done that, replacing "\0" as per your line:

String e = c.replace("\0", "");

will work correctly.
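
As a quick check, the sketch below builds the string both ways and strips the NULs. The class name `NulReplaceDemo` and the helper `stripNul` are hypothetical names used only for this example.

```java
public class NulReplaceDemo {
    // Hypothetical helper: removes every NUL (U+0000) character from s.
    // String.replace takes literal text, not a regex, so "\0" needs no extra escaping.
    static String stripNul(String s) {
        return s.replace("\0", "");
    }

    public static void main(String[] args) {
        String split  = "2" + "\0\0\0" + "012yyyy06mm";  // string broken apart
        String padded = "2\000\000\000012yyyy06mm";      // escapes padded to three digits

        System.out.println(split.equals(padded));   // true
        System.out.println(padded.length());        // 15
        System.out.println(stripNul(padded));       // 2012yyyy06mm
    }
}
```

Both literals produce the same 15-character string, and after the replace you get the 12 characters you wanted.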

Alnitak
  • Hi, thanks for the help. String c = "2\0\0\0012yyyy06mm"; String e = c.replace("\0", ""); System.out.println(e); // just prints 22yyyy06mm – user1900556 Dec 13 '12 at 10:43
  • @user1900556 yes, because the `\001` still embedded therein (between the two "2"s) is invisible. The whole point is that the string `c` that you have doesn't contain what you think it does. – Alnitak Dec 13 '12 at 10:44
  • @user1900556 I showed you - either fully pad your octal escapes to three digits, or break the string apart. Either way, you've got to fix the assignment of `c` - the `.replace("\0", ...)` is fine. – Alnitak Dec 13 '12 at 10:50
  • @user1900556 the `\001` is *one character*, not 2, 3 or 4; it can also be written as `\u0001`, which is still one character. Note: space can be written as `\040` or `\u0020` but it is always one character. You cannot replace just part of the character, and you cannot determine how it was defined either. – Peter Lawrey Dec 13 '12 at 10:57
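
The point in the last comment can be checked with a few lines. This is only an illustrative sketch; the class name `EscapeEquivalenceDemo` and the helper `codeOf` are invented for it.

```java
public class EscapeEquivalenceDemo {
    // Hypothetical helper: numeric code of the first character of s.
    static int codeOf(String s) {
        return s.charAt(0);
    }

    public static void main(String[] args) {
        // Octal and Unicode escapes are two spellings of the same single character.
        System.out.println("\001".length());          // 1
        System.out.println("\001".equals("\u0001"));  // true
        System.out.println("\040".equals("\u0020"));  // true
        System.out.println(codeOf("\040"));           // 32
    }
}
```

Once compiled, the string keeps no record of which spelling was used in the source.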