When the JSON spec talks about Unicode characters, it really means Unicode codepoints. Every valid \uXXXX sequence represents a valid codepoint, since all values from 0x0000 through 0xFFFF are defined Unicode codepoints. However, \uXXXX can only represent codepoints up to U+FFFF, while Unicode defines codepoints all the way up to U+10FFFF.
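To see that four-hex-digit ceiling in practice, here is a quick illustration using Python's json module (just one convenient conforming parser; the behavior follows from the JSON grammar, not from anything Python-specific). An escape is always exactly four hex digits, so a codepoint like U+1D11E has no single-escape form:

    import json

    # A BMP codepoint fits in a single escape:
    print(json.loads('"\\u005C"'))   # a single reverse solidus: \

    # Five hex digits are not read as one escape; the parser consumes
    # \u1D11 (four digits) and then the literal letter E:
    print(json.loads('"\\u1D11E"'))  # 'ᴑE' (U+1D11 followed by 'E')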
When not using escaped hex notation, the full range of Unicode codepoints can be used as-is in JSON. When using escaped hex notation, on the other hand, only codepoints up to U+FFFF can be written directly. This is OK though, because codepoints above U+FFFF must be represented using UTF-16 surrogate pairs: two 16-bit code units, each fitting in the \uXXXX range, acting together. This is described in RFC 7159, Section 7 (Strings):
Any character may be escaped. If the character is in the Basic
Multilingual Plane (U+0000 through U+FFFF), then it may be
represented as a six-character sequence: a reverse solidus, followed
by the lowercase letter u, followed by four hexadecimal digits that
encode the character's code point. The hexadecimal letters A though
F can be upper or lower case. So, for example, a string containing
only a single reverse solidus character may be represented as
"\u005C".
...
To escape an extended character that is not in the Basic Multilingual
Plane, the character is represented as a 12-character sequence,
encoding the UTF-16 surrogate pair. So, for example, a string
containing only the G clef character (U+1D11E) may be represented as
"\uD834\uDD1E".
So your question should not be "does \uXXXX correspond to a Unicode character?", because it logically always will: all values 0x0000 through 0xFFFF are valid Unicode codepoints. The real question should be "does \uXXXX correspond to a Unicode codepoint in the BMP, and if not, does it belong to a \uXXXX\uXXXX pair that forms a valid UTF-16 surrogate pair?".
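A minimal sketch of that check (classify is a hypothetical helper, not a standard API; it inspects the 16-bit value written in a single escape):

    def classify(unit):
        # 'unit' is the 16-bit value of one \uXXXX escape.
        if 0xD800 <= unit <= 0xDBFF:
            return 'high surrogate: must be followed by a low surrogate'
        if 0xDC00 <= unit <= 0xDFFF:
            return 'low surrogate: only valid right after a high surrogate'
        return 'BMP codepoint: a complete character on its own'

    print(classify(0x005C))  # BMP codepoint: a complete character on its own
    print(classify(0xD834))  # high surrogate: must be followed by a low surrogate
    print(classify(0xDD1E))  # low surrogate: only valid right after a high surrogate

Note that RFC 7159 (Section 8.2) explicitly calls the behavior for unpaired surrogates such as "\uDEAD" unpredictable, so parsers legitimately differ on whether to reject them.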