
I am struggling to merge two JSON arrays with jq: when objects in both arrays share the same key, I want to keep only the object from the second file.

Edit: added a second key since the example was too simple.

file1.json :

[
  {"a": 1, "value": 11},
  {"b": 2},
  {"c": 3}
]

file2.json :

[
  {"a": 4, "value": 44},
  {"b": 5},
  {"d": 6}
]

Expected result:

[
  {"a": 4, "value": 44},
  {"b": 5},
  {"c": 3},
  {"d": 6}
]

jq add file1.json file2.json duplicates the keys (I have two objects with key "a" in the array).

I tried many answers from the web, but each one addresses its own use case and none worked directly. The closest is "JQ - Merge two arrays", but I can't get it to work with files instead of string arguments.

My last attempt was:

jq \
  --slurpfile base file1.json \
  --slurpfile params file2.json \
  '$base + $params | unique_by(.Key)'
JulienD

2 Answers


[This answer has been edited to reflect the change in the Q.]

The following solution uses INDEX/2 as defined in https://github.com/stedolan/jq/blob/master/src/builtin.jq. One advantage of using INDEX is that it avoids group_by and the cost of the sort it entails, a sort that may not be wanted in the first place.

In case your version of jq does not have INDEX/2, here is its definition:

def INDEX(stream; idx_expr):
  reduce stream as $row ({};
    .[$row|idx_expr|
      if type != "string" then tojson
      else .
      end] |= $row);

This filter (INDEX/2) constructs a dictionary whose keys are the distinct values of idx_expr applied to the items of the stream (non-string values are converted with tojson). The value stored under each key is the last item in the stream that maps to it.

program.jq:

[INDEX( add[] | to_entries; (.[0] | .key) )[]
 | from_entries ]

Invocation:

jq -scf program.jq file1.json file2.json

Output:

[{"a":4,"value":44},{"b":5},{"c":3},{"d":6}]
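To see the "last item wins" behaviour of INDEX/2 in isolation, here is a minimal sketch. The three rows and the `id`/`n` field names are made up for illustration, and the definition is repeated inline only so that the snippet also runs on jq versions that lack the builtin.

```shell
# INDEX/2 is built in from jq 1.6 on; it is redefined here so older versions work too.
result=$(jq -nc '
  def INDEX(stream; idx_expr):
    reduce stream as $row ({};
      .[$row | idx_expr | if type != "string" then tojson else . end] |= $row);
  # Hypothetical rows: the id "a" occurs twice, so the later row replaces the earlier one.
  INDEX(({"id":"a","n":1}, {"id":"b","n":2}, {"id":"a","n":3}); .id)
')
echo "$result"
```

The dictionary ends up with one entry per distinct `id`, and the entry for "a" holds the row with `n` equal to 3. This is why objects from file2.json override those from file1.json in the program above.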
peak

jq solution:

jq --slurpfile file2 file2.json \
'. + $file2[] | map(to_entries) | flatten 
 | group_by(.key) | map(.[-1] | {(.key): .value})' file1.json

The output (for the original example, where every object had a single key):

[
  {
    "a": 4
  },
  {
    "b": 5
  },
  {
    "c": 3
  },
  {
    "d": 6
  }
]
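The filter above splits every object into single-key entries, so an object with more than one key is broken apart. A possible variant, offered as a sketch rather than a tested drop-in, groups whole objects by their first key instead. It assumes that the first key of each object is the one that identifies it, as in the sample files, which are recreated here in a scratch directory.

```shell
# Recreate the two sample files from the question in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1.json" <<'EOF'
[{"a": 1, "value": 11}, {"b": 2}, {"c": 3}]
EOF
cat > "$tmp/file2.json" <<'EOF'
[{"a": 4, "value": 44}, {"b": 5}, {"d": 6}]
EOF

# -s slurps both arrays into one outer array; add concatenates them.
# group_by groups the objects by their first key; .[-1] keeps the last object of each group.
result=$(jq -sc 'add | group_by(keys_unsorted[0]) | map(.[-1])' \
  "$tmp/file1.json" "$tmp/file2.json")
echo "$result"
rm -r "$tmp"
```

Because group_by sorts, the result is ordered by the grouping key rather than by the order of the input.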
RomanPerekhrest
  • While it works with the simplified case I used as an example, as soon as one of the objects has a second key, say `{"a": 1, "t": 10}`, it will create a new object with that key in the result: `[{"a": 1}, {"t": 10}]`. – JulienD Jun 12 '18 at 20:01