I have a JSON file and I am extracting data from it using jq. One simple use case is pulling out every JSON object whose id matches a value that is passed in as an argument.

I use the following simple script to do so:

[.[] | select(.id == $ID)]

The script is stored in a separate file (by_id.jq) which I pass in using the -f argument.

The full command looks something like this:

cat ./my_json_file.json | jq -sf --arg ID "8df993c1-57d5-46b3-a8a3-d95066934e5b" ./by_id.jq

Is there a way, using only jq, to pass a comma-separated list of ids as an argument to the jq script, iterate through those ids, and check each one against the value of .id in the JSON file, so that the result is the set of objects that have one of those ids?

For example if I wanted to pull out three objects by their ids I would want to structure the command in this way:

cat ./my_json_file.json | jq -sf --arg ID "8df993c1-57d5-46b3-a8a3-d95066934e5b,1d5441ca-5758-474d-a9fc-40d0f68aa538,23cc618a-8ad4-4141-bc1c-0251y0663963" ./by_id.jq

Daniel McC

3 Answers


Sure, though you'll need to parse (split) that list of ids into something that jq can work with, such as an array of ids. Your problem then becomes: given an array of ids, select the objects whose id is any of them. For that you could use the approaches found here.

$ jq --arg ID '8df993c1-57d5-46b3-a8a3-d95066934e5b,1d5441ca-5758-474d-a9fc-40d0f68aa538,23cc618a-8ad4-4141-bc1c-0251y0663963' '
select(.id | IN($ID|split(",")[]))
' ./my_json_file.json

I'm not sure what your input looks like, but since you slurp it and then filter the slurped array, it is presumably a stream of objects. In that case the slurping is not necessary.
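To make this concrete, here is a minimal sketch that runs the same filter against a small stream of objects without -s. The sample objects and the ids a1, b2, c3 are made up for illustration, and IN needs jq 1.6 or later.

```shell
# Hypothetical sample data: a stream of objects, one per line (not an array).
input='{"id":"a1","name":"first"}
{"id":"b2","name":"second"}
{"id":"c3","name":"third"}'

# No -s here: jq applies the filter once per object in the stream.
# -c prints each matching object on a single line.
result=$(printf '%s\n' "$input" \
  | jq -c --arg ID 'a1,c3' 'select(.id | IN($ID | split(",")[]))')

printf '%s\n' "$result"
```

With this sample data, only the objects with ids a1 and c3 should be printed, each as its own top-level value rather than inside an array.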

Jeff Mercado
  • This was the best option for me however we are on version 1.5 at the moment which does not include `IN` so I had to come up with a workaround until I can upgrade to 1.6. The solution I used was `[.[] |. as $json | select($ID | split(",")[] as $stringArray | $stringArray | . == $json.id)]` – Daniel McC Aug 12 '19 at 10:56
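For jq versions without IN, a possible alternative to the workaround in the comment is to split the list once, bind it to a variable, and test membership with index. This is only a sketch: the sample ids are made up, and it assumes that split/1 and index/1 behave the same way in 1.5 as they do in later releases.

```shell
# Hypothetical sample data: a stream of objects, slurped with -s as in the question.
input='{"id":"a1"}
{"id":"b2"}
{"id":"c3"}'

# The id list is split once and bound to $ids.
# index/1 returns the position of the id in $ids, or null when it is absent;
# null is falsy in jq, while any number (including 0) is truthy.
result=$(printf '%s\n' "$input" | jq -sc --arg ID 'b2,c3' '
  ($ID | split(",")) as $ids
  | [.[] | select(.id as $id | $ids | index($id))]
')

printf '%s\n' "$result"
```

Unlike the comparison inside the generator, this emits each object at most once even if the same id appears twice in the list.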

Here is an approach that focuses on efficiency.

Your question indicates that you in fact have a stream of objects, so the first step towards efficiency is to avoid the -s option and to use -n with inputs instead.

The second step is to avoid splitting your comma-separated string of values more than once.

So your script might look like this:

INDEX($ids | splits(","); .) as $dict
| inputs
| select($dict[.id])

And the invocation would look like this:

jq -n --arg ids a,b,c -f by_id.jq

This of course assumes that simply splitting the string of ids on "," will suffice. You might need to trim the values and take care of other potential anomalies.
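As a rough sketch of that kind of clean-up, the variant below strips whitespace around each id and drops empty entries before building the dictionary. The sample data and the messy id list are invented for the example, and the trimming regex is just one way to do it.

```shell
# Hypothetical sample data: a stream of objects read via `inputs`.
input='{"id":"a1"}
{"id":"b2"}
{"id":"c3"}'

# The id list deliberately contains stray spaces and empty entries.
result=$(printf '%s\n' "$input" | jq -nc --arg ids ' a1 , c3 ,,' '
  INDEX(
    $ids
    | splits(",")
    | gsub("^\\s+|\\s+$"; "")   # strip leading and trailing whitespace
    | select(length > 0);        # drop entries that are empty after trimming
    .
  ) as $dict
  | inputs
  | select($dict[.id])
')

printf '%s\n' "$result"
```

Note that $dict[.id] raises an error if an object has no string id, so inputs that may lack the field would need an extra guard.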

peak
  • Thank you for your answer. I could not use this as we want all our scripts to use the same flags as they are called from Python and the input is passed in an expected format. – Daniel McC Aug 12 '19 at 11:00
  • @DanielMcC - You could still use the second part of this answer but as that might not be clear, I've added a separate answer. – peak Aug 12 '19 at 15:08

For efficiency, it would be better to split $ID just once.

So if you have to use the -s option, you could use the following jq program:

INDEX($ID | splits(","); .) as $dict
| .[]
| select($dict[.id])
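
The program above emits the matching objects as a stream. If the callers expect a single array, as the script in the question produced, the selection can be wrapped in brackets. Here is a sketch with the same flags as in the question; the sample data is made up, and the script is written to a temporary file purely for the demonstration.

```shell
# Hypothetical sample data: a stream of objects, slurped into one array by -s.
input='{"id":"a1"}
{"id":"b2"}
{"id":"c3"}'

# Stand-in for by_id.jq, written to a temporary file for this demo.
script=$(mktemp)
cat > "$script" <<'EOF'
INDEX($ID | splits(","); .) as $dict
| [ .[] | select($dict[.id]) ]
EOF

# -s slurps the input, -f reads the program from the file, --arg passes the id list.
result=$(printf '%s\n' "$input" | jq -c -s --arg ID 'a1,b2' -f "$script")
rm -f "$script"

printf '%s\n' "$result"
```

The id list is still split only once, because $dict is built before the array is iterated.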
peak