JSON Data Optimization by removing repeated column names

Question

I have a basic JSON question. I have a JSON file in which every object repeats the same column names.

[
  {
    "id": 1,
    "name": "ABCD"
  },
  {
    "id": 2,
    "name": "ABCDE"
  },
  {
    "id": 3,
    "name": "ABCDEF"
  }
]

To optimize this, I was thinking of removing the repeated column names:

{
    "cols": [
        "id",
        "name"
    ],
    "rows": [
        [
            "1",
            "ABCD"
        ],
        [
            "2",
            "ABCDE"
        ]
    ]
}

What I am trying to understand is whether this is a better approach. Are there any disadvantages to this format, for example when writing unit tests?

Wasif Hossain · Answer 1 · 2014-02-27T06:15:55.863

EDIT

The second case (after your edit) is valid JSON. Using json2csharp, you can derive the following class from it:

public class RootObject
{
    public List<string> cols { get; set; }
    public List<List<string>> rows { get; set; }
}

The important point is that when records are represented as JSON objects, there is no way around repeating the column names (or keys in general) in every object. You can check the validity of your JSON at jsonlint.com.

If you want to optimize the JSON by compressing it, in addition to a general-purpose compressor such as gzip, I would recommend JSON.hpack.

This format defines several compression levels, ranging from 0 to 4 (4 compresses the most).

At compression level 0:

You remove the keys (property names) from the structure and create a header at index 0 that lists each property name. Your compressed JSON would then look like this:

[
  [
    "id",
    "name"
  ],
  [
    1,
    "ABCD"
  ],
  [
    2,
    "ABCDE"
  ],
  [
    3,
    "ABCDEF"
  ]
]

In this way, you can compress your JSON at whichever level you want. However, to work with the data as objects in a JSON library, you first have to decompress it back to the original form, with repeated property names, like the one you provided earlier.
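As a rough illustration of the level 0 idea described above (this is not the JSON.hpack library itself), a round trip could be sketched in Python as follows. The function names `pack_level0` and `unpack_level0` are made up for this example, and the sketch assumes every object has the same keys.

```python
import json


def pack_level0(records):
    """Turn a list of same-shaped dicts into [header, row, row, ...]."""
    if not records:
        return []
    # Assumption: the first record defines the column order for all rows.
    header = list(records[0].keys())
    return [header] + [[rec[key] for key in header] for rec in records]


def unpack_level0(packed):
    """Rebuild the list of dicts from [header, row, row, ...]."""
    if not packed:
        return []
    header, *rows = packed
    # Pair each value with the header name at the same position.
    return [dict(zip(header, row)) for row in rows]


records = [
    {"id": 1, "name": "ABCD"},
    {"id": 2, "name": "ABCDE"},
    {"id": 3, "name": "ABCDEF"},
]

packed = pack_level0(records)
print(json.dumps(packed))

# Serialize, parse again, and expand back to the original objects.
restored = unpack_level0(json.loads(json.dumps(packed)))
```

The packed value is itself parseable by any JSON parser; the unpack step is only needed when the consuming code expects objects with named properties.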

For reference, here is a comparison of different compression techniques:

[image: comparison of compression techniques, not available]
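To get a rough feel for such a comparison on your own data, you could measure the sizes yourself. The sketch below uses only Python's standard `json` and `gzip` modules; the helper names `as_cols_rows` and `sizes` and the synthetic sample data are assumptions made for this example, not part of the original question.

```python
import gzip
import json


def as_cols_rows(records):
    """Convert a list of same-shaped dicts to the {"cols": ..., "rows": ...} layout."""
    cols = list(records[0].keys())
    return {"cols": cols, "rows": [[rec[c] for c in cols] for rec in records]}


def sizes(value):
    """Return (raw byte count, gzip-compressed byte count) of the compact JSON text."""
    raw = json.dumps(value, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Synthetic sample data (an assumption, not the asker's real file).
records = [{"id": i, "name": "ABCD" + str(i)} for i in range(1000)]

plain_raw, plain_gz = sizes(records)
packed_raw, packed_gz = sizes(as_cols_rows(records))

print("objects:   raw", plain_raw, "gzip", plain_gz)
print("cols/rows: raw", packed_raw, "gzip", packed_gz)
```

Because gzip already handles repeated keys well, the difference after compression is often much smaller than the difference in raw size, so it is worth measuring on real data before changing the format.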

So I cannot have a JSON format without repeating column names? The JSON data I mentioned in option 2 is valid JSON per JSONLint: { "cols": [ "id", "name" ], "rows": [ [ "1", "ABCD" ], [ "2", "ABCDE" ] ] } – user1401472, Feb 27 '14 at 05:57

score 1 · Answer 2 · answered Feb 27 '14 at 05:21

{
   "cols": [
       "id",
       "name"
   ],
   "rows": [
       ["1", "ABCD"],
       ["2", "ABCDE"],
       ["3", "ABCDEF"]
   ]
}

With this approach, it is harder to tell which value stands for which item (id, name), because only its position in the row identifies it. Your first approach is the better one if you use this JSON for communication.
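To make the positional mapping concrete, here is a minimal Python sketch of what the receiving side would have to do with the cols/rows layout. The function name `rows_to_records` is hypothetical, and the length check is one possible way to catch rows that do not line up with the column list.

```python
import json


def rows_to_records(payload):
    """Expand {"cols": [...], "rows": [[...], ...]} into a list of dicts."""
    cols = payload["cols"]
    records = []
    for index, row in enumerate(payload["rows"]):
        # A value is only meaningful through its position, so a row with
        # the wrong number of values cannot be mapped safely.
        if len(row) != len(cols):
            raise ValueError(
                f"row {index} has {len(row)} values, expected {len(cols)}"
            )
        records.append(dict(zip(cols, row)))
    return records


text = '{"cols": ["id", "name"], "rows": [["1", "ABCD"], ["2", "ABCDE"]]}'
records = rows_to_records(json.loads(text))
print(records)
```

A unit test for this layout would mostly assert on the expanded records, plus a case or two for malformed rows.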

answered Feb 27 '14 at 05:21

Rashad


Thanks. Any thoughts from a unit testing perspective? Anything that would restrict me from writing unit tests if I use option 2? – user1401472 Feb 27 '14 at 05:59
Let's consider "2", "ABCDE": which one is "id" and which is "name"? If it is fixed that id will always be a number and name will never be a number, then you can do it, but it will take some more programming on the end that receives this JSON data. – Rashad Feb 27 '14 at 06:05

UserOfStackOverFlow · Answer 3 · 2021-10-11T19:50:10.337

One solution is to use an object-relational mapper (ORM) of your choice.

That way, you can compress your JSON data and still work with a legible structure in your code.

Please see this article: What is "compressed JSON"?

edited Oct 11 '21 at 19:50

answered Oct 11 '21 at 19:38

UserOfStackOverFlow

