25 votes

Consider this JSON response:

[{
    "Name": "Saeed",
    "Age": 31
}, {
    "Name": "Maysam",
    "Age": 32
}, {
    "Name": "Mehdi",
    "Age": 27
}]

This works fine for a small amount of data, but when you want to serve larger amounts (say, many thousands of records), it seems logical to somehow prevent those repetitions of property names in the response JSON.

I Googled the concept ("DRYing JSON") and, to my surprise, didn't find any relevant results. One way, of course, is to compress the JSON using a simple home-made algorithm and decompress it on the client side before consuming it:

[["Name", "Age"],
["Saeed", 31],
["Maysam", 32],
["Mehdi", 27]]
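Undoing this on the client takes only a few lines of JavaScript. A minimal sketch (the helper name `inflate` is mine, not an established API):

```javascript
// Expand [[key1, key2, ...], [row values...], ...] back into objects.
function inflate(packed) {
  const [keys, ...rows] = packed;
  return rows.map(row =>
    Object.fromEntries(keys.map((key, i) => [key, row[i]]))
  );
}

const packed = [['Name', 'Age'],
                ['Saeed', 31],
                ['Maysam', 32],
                ['Mehdi', 27]];
console.log(inflate(packed));
// → [{ Name: 'Saeed', Age: 31 }, { Name: 'Maysam', Age: 32 }, { Name: 'Mehdi', Age: 27 }]
```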

However, an established best practice would be better than each developer reinventing the wheel. Have you seen a well-known, widely accepted solution for this?

Saeed Neamati
  • JSON is a data structure, so it doesn't really fall under DRY. – JJJ Dec 30 '12 at 07:29
  • The redundancy inherent in this kind of JSON compresses very well if gzip is used. You probably already knew that, but just in case it turns out that no generally accepted technique for writing compact JSON documents exists, this is probably why. :) – Ray Toal Dec 30 '12 at 07:30
  • Your "home-made" idea is a good start. Search for "JSON compression" instead; you'll find several ideas, such as [HPack](http://stackoverflow.com/questions/11774375/json-compression-for-transfer). – DCoder Dec 30 '12 at 07:30
  • Just saw [MsgPack](http://msgpack.org/) on HN; you could try that. – Waleed Khan Dec 30 '12 at 07:57
  • If you are transferring significant data in JSON in one shot, consider sending it as 'pages' or 'chunks' instead of trying to manage the compressibility of the whole set. I'm sure there are lots of reasons why the compression question is useful, but you may be trying to solve the problem the wrong way, ultimately introducing more faults than a more compact or page/object-oriented approach would. – Ben West Dec 30 '12 at 15:12

5 Answers

15 votes

First off, JSON is not meant to be the most compact way of representing data. It's meant to parse directly into a JavaScript data structure designed for immediate consumption, without further parsing. If you want to optimize for size, then you probably don't want self-describing JSON: you need to let your code make a bunch of assumptions about how to handle the data, and do some manual parsing on the receiving end. It's those assumptions and the extra coding work that can save you space.

If the property names and format of the server response are already known to the code, you could just return the data as an array of alternating values:

['Saeed', 31, 'Maysam', 32, 'Mehdi', 27]

or, if it's safe to assume that names don't include commas, you could even return a comma-delimited string that you split into its pieces and stick into your own data structures:

"Saeed, 31, Maysam, 32, Mehdi, 27"

or, if you still want valid JSON, you can wrap that string in an array, which is only slightly better than my first version, where the items themselves are array elements:

["Saeed, 31, Maysam, 32, Mehdi, 27"]

These assumptions and this compactness put more of the parsing responsibility on your own JavaScript, but it is precisely the removal of the self-describing nature of the full JSON you started with that makes it more compact.
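For instance, the alternating-value array can be rebuilt on the receiving end with a short loop. A sketch, assuming the field order (`Name`, then `Age`) is agreed upon out of band:

```javascript
// Rebuild objects from a flat [name, age, name, age, ...] array.
// The field order is a shared assumption, not part of the payload.
function unpack(flat) {
  const people = [];
  for (let i = 0; i < flat.length; i += 2) {
    people.push({ Name: flat[i], Age: flat[i + 1] });
  }
  return people;
}

console.log(unpack(['Saeed', 31, 'Maysam', 32, 'Mehdi', 27]));
// → [{ Name: 'Saeed', Age: 31 }, { Name: 'Maysam', Age: 32 }, { Name: 'Mehdi', Age: 27 }]
```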

jfriend00
    It's worth noting that the last example is not technically a JSON document, because JSON documents must have an array or map as the root object. – Dietrich Epp Dec 30 '12 at 09:41
  • @DietrichEpp - good point - I added an option where the string is wrapped in an array too. The choice among these options depends upon how much the OP wants to stick with legal JSON vs. optimize the bytes transferred. – jfriend00 Dec 30 '12 at 19:55
10 votes

One solution is known as the hpack algorithm:

https://github.com/WebReflection/json.hpack/wiki
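The core idea is to homogenize a collection of same-shaped objects into one flat array: a key count first, then the key names once, then the values row by row. This sketch only approximates the library's output format (check the wiki for the exact layout and compression levels):

```javascript
// hpack-style flattening (an approximation of the idea, not the
// library's exact wire format): [keyCount, keys..., values row by row...]
function hpackish(records) {
  const keys = Object.keys(records[0]);
  return [keys.length, ...keys, ...records.flatMap(r => keys.map(k => r[k]))];
}

console.log(hpackish([
  { Name: 'Saeed', Age: 31 },
  { Name: 'Maysam', Age: 32 }
]));
// → [2, 'Name', 'Age', 'Saeed', 31, 'Maysam', 32]
```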

Yubin Bai
7 votes

You might be able to use a CSV format instead of JSON, as you would only specify the property names once. However, this would require a rigid structure like in your example.

JSON isn't really the kind of thing that lends itself to DRY, since it's already quite well-packaged considering what you can do with it. Personally, I've used bare arrays for JSON data that gets stored in a file for later use, but for simple AJAX requests I just leave it as it is.

DRY usually refers to what you write yourself, so if your object is being generated dynamically you shouldn't worry about it anyway.
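As a sketch of the CSV idea, assuming no value contains a comma or newline (a real implementation needs proper quoting and escaping):

```javascript
// Naive CSV serialization: the header row names each property once,
// then every record becomes one line of bare values.
function toCsv(records) {
  const keys = Object.keys(records[0]);
  return [keys.join(','), ...records.map(r => keys.map(k => r[k]).join(','))]
    .join('\n');
}

console.log(toCsv([
  { Name: 'Saeed', Age: 31 },
  { Name: 'Maysam', Age: 32 },
  { Name: 'Mehdi', Age: 27 }
]));
// → Name,Age
//   Saeed,31
//   Maysam,32
//   Mehdi,27
```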

Niet the Dark Absol
5 votes

Use gzip compression, which is readily built into most web servers and clients?

It will still take some (extra) time and memory to generate and parse the JSON at each end, but it won't take much time to send over the network, and it requires minimal implementation effort on your part.

It might be worth a shot even if you pre-compress your source data somehow.

Macke
1 vote

It's actually not a problem for JSON that you've often got massive string or "property" duplication (nor is it for XML).

This is exactly what the duplicate-string elimination in the DEFLATE algorithm (used by gzip) addresses.

While most browser clients can accept gzip-compressed responses, traffic back to the server won't be compressed.

Does that warrant using "JSON compression" (i.e. hpack or some other scheme)?

  1. It's unlikely to be much faster than implementing gzip compression in JavaScript (which is not impossible; on a reasonably fast machine you can compress 100 KB in 250 ms).

  2. It's pretty difficult to safely process untrusted JSON input. You need to use stream-based parsing and decide on a maximum complexity threshold, or else your server might be in for a surprise. See for instance Armin Ronacher's Start Writing More Classes:

    If your neat little web server is getting 10000 requests a second through gevent but is using json.loads then I can probably make it crawl to a halt by sending it 16MB of well crafted and nested JSON that hog away all your CPU.
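A minimal defensive step, short of a full streaming parser, is to bound the payload size before `JSON.parse` ever sees it (a sketch; `safeParse` and the 1 MB limit are illustrative, and a real server would also want to limit nesting depth):

```javascript
// Reject oversized payloads up front. This caps memory use, but it does
// NOT bound nesting depth; a stream-based parser with a complexity
// threshold is needed for that.
function safeParse(text, maxBytes = 1024 * 1024) {
  if (Buffer.byteLength(text, 'utf8') > maxBytes) {
    throw new Error('JSON payload too large');
  }
  return JSON.parse(text);
}

console.log(safeParse('[{"Name":"Saeed","Age":31}]'));
// → [ { Name: 'Saeed', Age: 31 } ]
```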

malthe