
I'm trying to figure out how to get UTF-16 characters into my JSON string. This is my JSON:

ourWideStringClass jsonNoError("{    \
    \"type\" : \"ROI\", \
    \"m_position\" : { \
        \"m_x\" : 16, \
        \"m_y\" : 32, \
        \"m_z\" : 0 \
    }, \
    \"m_size\" : { \
        \"m_width\" : 64, \
        \"m_height\" : 128, \
        \"m_depth\" : 3 \
    } \
    } ");

So if I have a UTF-16 character like 0x9CE5, I'm not sure how to put it in the above. I know I can't just put 0x9CE5 in the JSON, e.g. by replacing 128 with 0x9CE5. I'm not sure what to do. Any ideas?

I was reading about the \u encoding. I'm not sure if I can use the \u escape in my JSON string or how to do that. Would it be \u"m_x"?

Alex Shesterov
Michele

1 Answer


If you need to have the U+9CE5 character as a part of a string within JSON, you can do so by encoding it as \u9ce5, e.g.:

"key": "before\u9ce5after"

Alternatively, you can include this character unencoded in your JSON, e.g.:

"key": "before鳥after"

Please refer to json.org for the full JSON syntax.

Please note that the term "UTF-16 character" is not correct. UTF-16 doesn't define any characters; it's a specification of how to encode Unicode characters. The characters themselves (code points) are defined by the Unicode standard, and a whole string of Unicode characters may be encoded as UTF-16, UTF-8, or another encoding scheme.

So all of your JSON is in some encoding (e.g. in UTF-16), not just separate characters.

Alex Shesterov
  • Thanks! I'm planning on processing the entire json string as wchar_t*, read into a wstream. I just need to test with an actual utf16 part in it. – Michele Apr 09 '17 at 15:56
  • If I'm putting it in this line, do you think I'm doing the utf16 char correctly? It's an escaped c++ string. "{\"type\":\"string\",\"value\":\"\u9ce5\"},\n" – Michele Apr 09 '17 at 16:11
  • when I do wcout for my json, it's showing ? for u9ce5. Maybe there's something wrong? – Michele Apr 09 '17 at 16:27
  • 1
    `"{\"type\":\"string\",\"value\":\"\u9ce5\"}` - this should result in a valid JSON, but note that C++ interprets `\u9ce5` when parsing the string literal. So the actual JSON string will contain this character *unencoded*. If you'd like JSON string to actually contain the escaped character, you need to escape the backslash by doubling it, i.e. `"{\"type\":\"string\",\"value\":\"\\u9ce5\"}` (note the doubled `\` before `u`). – Alex Shesterov Apr 09 '17 at 16:41
  • Thanks for the help. I have another question here if you're interested. http://stackoverflow.com/questions/43309439/json-string-with-utf16-char-cannot-convert-from-const-char-566-to-stdbasi – Michele Apr 09 '17 at 16:59
@AlexShesterov - BTW, when I try this: L"{\"type\":\"string\",\"value\":L\"\\u9CE5\"},\n" the json isn't valid. When I take out the extra L before \"\\u9ce5, cout-ing the json shows "\u9CE5", but I would expect it to show the Japanese character. I have the extra slash to escape the \u as you can see. – Michele Apr 09 '17 at 19:18