I use the jsonnet tool to convert this JSON:

 "{a:\"李\"}" 

The result is:

{
   "name": "\u00c0\u00ee"
}

Why is the Chinese character converted into two Unicode escapes?

Misha Brukman
Eric
  • Do the input and output encodings not agree? – Dummy00001 Jun 02 '16 at 12:31
  • The title doesn't make sense. Chinese is a language; Unicode is a character set. Chinese is usually encoded in GB18030, UTF-8, UTF-16 or UTF-32. Those details don't really matter for JSON, which uses `\uXXXX` escapes. But yes, you'd expect `李` to be `\u674E` or `\uF9E1` (the latter seems to be a compatibility ideograph) – MSalters Jun 02 '16 at 12:31
  • Eric, may I suggest reading [this article](http://kunststube.net/encoding/) on encodings. A system, depending on its origin, may use a different encoding than the one commonly agreed upon. In PHP, encodings are not tracked at all, since strings are treated as byte sequences; in Java, some operations are enforced to ensure that a document is "encoded by a contract"; and in C, I have seen behavior that allows several encodings at once, with differently encoded strings in the same file. – Bonatti Jun 02 '16 at 12:49
  • It is not Unicode; it got converted to code page 936, aka "Simplified Chinese GBK", a double-byte encoding that dates from the previous century. The set of glyphs that have 0xc0 as the first byte [is here](https://msdn.microsoft.com/en-us/goglobal/gg650628). As you can tell, 0xee as the 2nd byte matches 李. You'll have to whack the tool over the head so it joins the 21st century; just saying "jsonnet" does not help us help you. – Hans Passant Jun 02 '16 at 13:46
  • Jsonnet expects Unicode. The input file is probably encoded using code page 936, and jsonnet treats each byte as a Unicode code point. Jsonnet itself definitely does not convert anything. – sbarzowski Sep 25 '17 at 19:36
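
The mechanism described in the comments can be reproduced outside of jsonnet. The following is only a sketch in Python, not what jsonnet does internally: it assumes the input was encoded as GBK (code page 936) and that each byte was then read as a separate code point, which is modelled here with a Latin-1 decode.

```python
import json

original = "李"

# Encode the character as GBK (code page 936): two bytes, 0xC0 0xEE.
gbk_bytes = original.encode("gbk")

# Assumption: the consumer treats every byte as its own code point.
# Decoding as Latin-1 models that, since Latin-1 maps byte N to U+00N.
misread = gbk_bytes.decode("latin-1")

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True).
escaped = json.dumps({"a": misread})

print(gbk_bytes.hex())
print(escaped)
```

The two escapes in the question, `\u00c0` and `\u00ee`, correspond to the two GBK bytes of the single character rather than to two Chinese characters.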

1 Answer

It works for me. I suspect you are not giving Jsonnet UTF-8 (which is required). You can use iconv to convert textual data to UTF-8.

$ jsonnet -e "{a:\"李\"}"
{
   "a": "李"
}
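
If the input really is in code page 936, as the comments suggest, it has to be re-encoded before jsonnet sees it. On the command line that would typically be something like `iconv -f GBK -t UTF-8`. The same step is sketched below in Python; the helper name `gbk_to_utf8` is made up for illustration, and GBK as the source encoding is an assumption about the asker's environment.

```python
def gbk_to_utf8(data: bytes) -> bytes:
    """Re-encode GBK (code page 936) bytes as UTF-8.

    Hypothetical helper; assumes the input is valid GBK.
    """
    return data.decode("gbk").encode("utf-8")


# The snippet from the question, as it would look if saved as GBK.
snippet_gbk = '{a:"李"}'.encode("gbk")

# After conversion the bytes are valid UTF-8 and can be passed to jsonnet.
snippet_utf8 = gbk_to_utf8(snippet_gbk)
print(snippet_utf8.decode("utf-8"))
```

For a file, read it in binary mode, pass the bytes through the helper, and write the result back in binary mode.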
Muhammad Omer Aslam