I have these statements:

int \u65549 = 9;
System.out.println(\u65549);

This compiles perfectly. And outputs

9

But:

System.out.println(Character.isJavaIdentifierStart(\u65549));

outputs

false


I did some research on this topic. I read the documentation, and it says:

This method cannot handle supplementary characters. To support all Unicode characters, including supplementary characters, use the isJavaIdentifierStart(int) method.

Then I did this:

int x = \u65549;
System.out.println(Character.isJavaIdentifierStart(x));

But even this prints:

false

So, does this mean that Java is confused about whether \u65549 is an identifier?

Konstantin Yovkov
dryairship
  • @bcsb1001 You dropped the `int \u65549` declaration in both snippets, and that's why you couldn't reproduce the issue. – rhino Apr 04 '16 at 16:00
  • The accepted answer forgot to mention your 3rd example. Actually, `int \u65549 = 9; int x = \u65549; Sop(Character.isJavaIdentifierStart(x));` (you forgot to mention that the declaration must come first) is the same as `int A = 9; int X = A; Sop(Character.isJavaIdentifierStart(X));`, which means it gives the same result as the 2nd example, `int A = 9; Sop(Character.isJavaIdentifierStart(A));` – 林果皞 Jun 01 '16 at 18:15

2 Answers

int \u65549 = 9;
System.out.println(Character.isJavaIdentifierStart(\u65549));

Here, \u65549 is the name of a variable, that also happens to contain the value 9. It should (and does) do the same as if you wrote:

System.out.println(Character.isJavaIdentifierStart(9));

which prints false, since you can't have a Java identifier starting with a whitespace character (\u0009 is the codepoint for HORIZONTAL TAB, '\t').
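To make the distinction between the variable's name and its value concrete, here is a minimal sketch. The class and method names are hypothetical, chosen only for illustration. It passes the stored value (9) to the check first, and then the first character of the name (\u6554):

```java
public class EscapedIdentifierDemo {
    // The escape covers exactly four hex digits, so this declares a
    // variable whose name is U+6554 followed by the digit 9, holding the int 9.
    static int readVariable() {
        int \u65549 = 9;
        return \u65549;
    }

    // What the call in the question tests: the variable's value, 9,
    // is treated as a code point (U+0009, HORIZONTAL TAB).
    static boolean valueIsIdentifierStart() {
        return Character.isJavaIdentifierStart(readVariable());
    }

    // What the question meant to test: the first character of the name.
    static boolean nameStartIsIdentifierStart() {
        return Character.isJavaIdentifierStart('\u6554');
    }

    public static void main(String[] args) {
        System.out.println(readVariable());
        System.out.println(valueIsIdentifierStart());
        System.out.println(nameStartIsIdentifierStart());
    }
}
```

The second check is the one that answers whether the name can begin an identifier. The first only asks whether a tab character can.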

Andy Turner
  • You did see my now-deleted comment in the notifications, didn't you? :) I made an obvious error in my thinking that I soon corrected, but the good thing is I helped make your post even more clear. **Edit:** I just checked the revision history and it seems that you happened to make the same mistake at first. Glad I'm not alone :) – rhino Apr 04 '16 at 16:24

\u65549 is interpreted as the Unicode character \u6554, followed by the character 9.

A Unicode escape is \u followed by exactly four hexadecimal digits, so the compiler consumes only \u6554 and leaves the trailing 9 as an ordinary character. This translation happens before the source is tokenized, so it applies everywhere in the source, not only inside a String literal.

\u65549 is therefore not a single Unicode escape. It is a two-character sequence, and since \u6554 is a letter and 9 is a digit, that sequence is a valid identifier.
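The four-digit rule can be checked with a String literal, where the resulting characters are easy to inspect. This is a rough sketch with hypothetical class and method names, not anything from the question:

```java
public class UnicodeEscapeDemo {
    // One escape (four hex digits) plus the ordinary digit 9.
    static final String TEXT = "\u65549";

    // Number of chars the literal ends up containing.
    static int length() {
        return TEXT.length();
    }

    // True if the escape produced U+6554 and left the 9 untouched.
    static boolean splitsAfterFourDigits() {
        return TEXT.charAt(0) == 0x6554 && TEXT.charAt(1) == '9';
    }

    // True if the same two characters would form a legal identifier:
    // a valid start character followed by a valid part character.
    static boolean formsValidIdentifier() {
        return Character.isJavaIdentifierStart(TEXT.charAt(0))
            && Character.isJavaIdentifierPart(TEXT.charAt(1));
    }

    public static void main(String[] args) {
        System.out.println(length());
        System.out.println(splitsAfterFourDigits());
        System.out.println(formsValidIdentifier());
    }
}
```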

Arnaud