
I have a piece of JSON in UTF-8 that looks like this in Google Chrome (without the new lines):

{"_links": {"self": {"href": "http://bla:8888/1/2/3/2257487e4a750cab"}, 
"it\u0119m": [{"href": "http://bla:8888/1/2/4/8f4fea003fe4c7fb284801d082de34a6"},
{"href": "http://bla:8888/1/2/4/c1213dd511c5427256c81f222e942c28"}]}}

First I remove all spaces for DBXJSON to work. Then I parse and print it, with this result:

{"_links":{"self":{"href":"http://bla:8888/1/2/3/2257487e4a750cab"},
"itęm":[{"href":"http://bla:8888/1/2/4/8f4fea003fe4c7fb284801d082de34a6"},
{"href":"http://bla:8888/1/2/4/c1213dd511c5427256c81f222e942c28"}]}}

That's how I want it, except for the need to remove spaces.

If I use the same JSON string as input to dwsJSON, interesting things happen:

{"_links":{"self":{"href":"http://bla:8888/1/2/3/2257487e4a750cab"},
"it\u0119m":[{"href":"4a6p://bla:8888/1/2/4/8f4fea003fe4c7fb284801d082de3/1."}
{"href":"c28p://bla:8888/1/2/4/c1213dd511c5427256c81f222e942\n\u0000\u0000"}]}}

Unicode escapes are not interpreted, \u0000 appears all over the place in a bigger file, and the string values are generally garbled.

What causes this, and where should I look to fix it? TdwsJSONValue.ParseString takes a UnicodeString and my input is a String, but I'm not sure whether that matters (I'm somewhat lost among all the Delphi string types).

Thijs van Dien
  • Why do you explicitly use `WideString` in the first place? If that input is what you're getting from the `TIdHTTP.Get` method and you're in Delphi 2010, use just `string` as a variable for holding the response content. Since Delphi 2009, `string` is mapped to the `UnicodeString`. – TLama Sep 22 '13 at 19:27
  • @TLama It is in fact a `String`; thought that was mapped to `WideString`. Updated. – Thijs van Dien Sep 22 '13 at 19:30
  • Sorry, I realized that when I checked the output again (and thus I deleted my comment). Well, from a quick look at [`the source`](https://code.google.com/p/dwscript/source/browse/trunk/Source/dwsJSON.pas#480) it seems that only the `TdwsJSONParserState.ParseJSONString` method takes care of escaped chars. But I'm not sure where it's being called (and without Delphi at hand it's quite difficult and time consuming to find out). In any case, this smells like a bug to me. A string starting with `http` being parsed to `4a6p` doesn't make much sense. – TLama Sep 22 '13 at 20:01
  • @TLama Yup, "stable" didn't mean much in this case. I updated to trunk and the problem is gone... I didn't expect it to be so simple, after my earlier fights with Unicode. – Thijs van Dien Sep 22 '13 at 20:29

1 Answer


Because of earlier struggles with Unicode, I really thought I was doing something wrong, but this was simply a bug in the "stable" dwsJSON I was using. The problem no longer occurs in SVN trunk at this time.
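
To check whether a particular dwsJSON revision is affected, it should be enough to parse the sample from the question and print it back. What follows is only a minimal sketch: it assumes the `dwsJSON` unit is on the search path and that `TdwsJSONValue` offers a `ToString` method for serialization (my assumption about the API; `ParseString` is the method named in the question). The program name and variable names are made up for illustration.

```delphi
program DwsJsonRoundTrip; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils,
  dwsJSON; // assumed to be on the search path

var
  Input: String;          // String is UnicodeString since Delphi 2009
  Value: TdwsJSONValue;
begin
  // Sample from the question, whitespace left in place.
  // The backslash is not special in Delphi literals, so \u0119 reaches the parser as-is.
  Input :=
    '{"_links": {"self": {"href": "http://bla:8888/1/2/3/2257487e4a750cab"}, ' +
    '"it\u0119m": [{"href": "http://bla:8888/1/2/4/8f4fea003fe4c7fb284801d082de34a6"}, ' +
    '{"href": "http://bla:8888/1/2/4/c1213dd511c5427256c81f222e942c28"}]}}';

  Value := TdwsJSONValue.ParseString(Input);
  try
    // Assumption: ToString serializes the parsed tree back to JSON text.
    Writeln(Value.ToString);
  finally
    Value.Free;
  end;
end.
```

If the href values come back starting with something other than `http`, or `\u0000` shows up in the output, the revision in use still has the bug.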

Thijs van Dien