I have a JSON string that contains the SUBSTITUTE control character (U+001A). I get a parsing exception when I try to convert the JSON string to a Java object using Jackson. How can I encode and decode such characters?

ObjectMapper mapper = new ObjectMapper();
mapper.readValue(jsonString, MY_DOMAIN_OBJECT.class);

jsonString:

{"studentId":"753253-2274", "information":[{"key":"1","value":"Get alerts on your phone(SUBSTITUTE character is present here. Unable to paste it)To subscribe"}]}

Error:

Illegal unquoted character ((CTRL-CHAR, code 26)): has to be escaped using backslash to be included in string value
Tech Guy
  • Can you please post the actual error? – Federico klez Culloca Apr 08 '19 at 15:18
  • Have you verified that it's really the `\u001A` that causes trouble? Couldn't it be some mismatch between your domain class and the JSON structure instead? – Mena Apr 08 '19 at 15:19
  • I've updated the question with the error. It's not a mismatch issue. @Mena – Tech Guy Apr 08 '19 at 15:31
  • Ah, of course. It's a control character. In JSON syntax you have to escape it. The error text actually tells you that, though technically not with a backslash but with a full Unicode escape sequence. – RealSkeptic Apr 08 '19 at 15:35
  • Have you tried [this](https://stackoverflow.com/a/43543664/9050921)? Seems the problem is the same. – Rub Apr 08 '19 at 15:37
  • Yes, I tried it and I see the same exception. @RubenYedigaryan – Tech Guy Apr 08 '19 at 15:41
  • Can you please give an example? @RealSkeptic – Tech Guy Apr 08 '19 at 15:42
  • No, not really. I don't know where you're getting that JSON string from and thus, I don't know how you can change it. – RealSkeptic Apr 08 '19 at 15:46
  • If at all possible, you should notify the source of that JSON that it is not producing valid JSON. From [the JSON specification](https://tools.ietf.org/html/rfc7159.html#section-7): “A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).” – VGR Apr 08 '19 at 16:01
  • Well, according to [json.org](http://www.json.org/) strings may not contain control characters. – MC Emperor Apr 08 '19 at 16:01
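
As the comments above point out, a raw control character inside a JSON string has to be written as a Unicode escape (for example `\u001A`). If the producer of the JSON cannot be fixed, one option is to escape such characters before handing the text to the parser. The following is only a minimal sketch under that assumption: the class and method names (`JsonControlCharEscaper`, `escapeControlChars`) are hypothetical, it uses only the JDK, and it assumes the input is otherwise well-formed JSON.

```java
public final class JsonControlCharEscaper {

    private JsonControlCharEscaper() {}

    /**
     * Hypothetical helper: rewrites raw control characters (U+0000 to U+001F)
     * that appear inside JSON string literals as six-character Unicode escapes.
     * Whitespace between tokens (outside of strings) is left untouched.
     */
    public static String escapeControlChars(String json) {
        StringBuilder out = new StringBuilder(json.length() + 16);
        boolean inString = false; // true while between an opening and a closing quote
        boolean escaped = false;  // true when the previous char was a backslash inside a string
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (!inString) {
                if (c == '"') {
                    inString = true;
                }
                out.append(c);
            } else if (escaped) {
                // Character following a backslash: copy it and leave the escape state.
                escaped = false;
                out.append(c);
            } else if (c == '\\') {
                escaped = true;
                out.append(c);
            } else if (c == '"') {
                inString = false;
                out.append(c);
            } else if (c < 0x20) {
                // Raw control character inside a string: emit a backslash, 'u', and four hex digits.
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Build the sample with an explicit cast so the source file contains no raw control character.
        String raw = "{\"value\":\"phone" + (char) 26 + "To subscribe\"}";
        System.out.println(escapeControlChars(raw));
    }
}
```

The escaped result can then be passed to `mapper.readValue(...)` as usual. Tracking whether the scan is inside a string matters because line breaks and tabs between JSON tokens are legal and must not be escaped.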

1 Answer

Can you try this?

ObjectMapper mapper = new ObjectMapper();
mapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);
mapper.readValue(jsonString, MY_DOMAIN_OBJECT.class);

I hope it helps. From the Javadoc:

Feature that determines whether parser will allow JSON Strings to contain unquoted control characters (ASCII characters with value less than 32, including tab and line feed characters) or not. If feature is set false, an exception is thrown if such a character is encountered. Since JSON specification requires quoting for all control characters, this is a non-standard feature, and as such disabled by default.

Madplay
  • This converts my SUBSTITUTE symbol to the Java escape (\u001A), but I want to convert it to HTML encoding () since I need to use this information on an HTML page. Is there any way to achieve this? @madplay http://www.fileformat.info/info/unicode/char/001a/index.htm – Tech Guy Apr 08 '19 at 18:45
  • Worked for me, thanks. It looks like `ALLOW_UNQUOTED_CONTROL_CHARS` is deprecated; use `JsonReadFeature.ALLOW_UNESCAPED_CONTROL_CHARS.mappedFeature()` instead. – Richard Feb 14 '23 at 00:46
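
On the follow-up question about using the value on an HTML page: once the JSON has been parsed, the string holds the actual control character, and it can be rewritten afterwards. The sketch below assumes that "HTML encoding" means a decimal numeric character reference such as `&#26;`; the original comment does not show the intended form, so that is a guess. The names `HtmlControlCharEncoder` and `toNumericReferences` are hypothetical, and only the JDK is used.

```java
public final class HtmlControlCharEncoder {

    private HtmlControlCharEncoder() {}

    /**
     * Hypothetical helper: replaces control characters below U+0020
     * (except tab, line feed and carriage return) with decimal numeric
     * character references. It does NOT escape markup characters such as
     * '<' or '&'; that would still need a proper HTML escaper.
     */
    public static String toNumericReferences(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String value = "Get alerts on your phone" + (char) 26 + "To subscribe";
        // prints: Get alerts on your phone&#26;To subscribe
        System.out.println(toNumericReferences(value));
    }
}
```

Some HTML parsers treat references to control characters as errors, so depending on the page it may be simpler to replace the character with a space or drop it instead.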